Abstract

This  paper  introduces  object-oriented  access  controls 
(OOAC) as a result of consistently applying the
object-oriented  paradigm  for  providing  access 
controls  in  object  and  interoperable  databases.  OOAC 
includes:  (1)  subjects,  like  users,  roles  etc.,  are 
regarded  as  first-class  objects,  (2)  objects  are 
accessed  by  sending  messages,  and  (3)  access  controls 
deal  with  controlling  the  flow  of  messages  among 
objects.  OOAC  are  not  intended  to  replace  legacy 
access  control  mechanisms  which  mainly  have  been 
designed  and  applied  in  non-object  environments. 
Instead,  they  provide  the  basis  for  applying  these 
concepts  in  true  object-oriented  environments.  An 
object authorization language (OAL) is proposed for
specifying  authorizations  in  a  declarative  manner.  We 
illustrate  the  feasibility  of  the  proposed  concepts  in 
applying  them  to  IRO-DB  II,  an  extension  of  the 
database  federation  IRO-DB,  that  provides 
interoperable  access  between  relational  and  object- 
oriented  database  systems  on  the  world-wide-web. 


Keywords 

security  policy,  object-orientation,  access  controls, 
interoperability 

1  Introduction 

The  importance  of  object-oriented  database  systems 
increased  dramatically  within  the  commercial 
database  market  in  the  last  few  years.  Especially,  new 
application  domains,  like  multimedia  or  interoperable 
environments,  illustrate  the  feasibility  and  usefulness 
of  object-orientation.  Concerning  access  controls, 
most  of  the  existing  models,  for  instance  discretionary 
access  controls  (DAC),  role-based  access  controls 
(RBAC),  or  mandatory  access  controls  (MAC),  have 
been  originally  designed  for  relational  database 
systems. However, applying these models to
object-oriented systems is not straightforward
(compare [6]). Particular object-oriented
characteristics, like object identity, encapsulation,
inheritance, polymorphism, and complex objects (see
[1]), require the integration of new mechanisms into
legacy access control concepts.

A remarkable amount of research has been
devoted  to  extending  access  control  concepts  for 
object  environments  (see  section  1.1).  All  this  work 
offers  the  possibility  to  understand  the  challenges  and 
research  issues  concerning  access  controls  within 
object-oriented  systems.  However,  we  see  the  need  for 
a  common  ground  to  start  from  in  order  to  develop 
true  object-oriented  access  controls.  This  is  mainly 
because  of  two  reasons:  (1)  to  the  best  of  our 
knowledge,  none  of  the  proposed  extensions  to  legacy 
access  controls  consider  subjects  (e.g.  users)  to  be 
first-class  objects  within  an  object  database.  Instead, 
subjects  are  treated  as  "extra-terrestrial"  entities  that 
are completely separated from the data objects. In
consequence, object-oriented features cannot be
applied to subjects, and access control concepts
cannot be applied within pure data object
communications. On the other hand, (2) most of the
proposed  extensions  still  regard  access  types  as  a  set 
of  elementary  actions  (like  read,  write,  execute,  etc.). 
This view ignores that messages are the means of
communication within an object system and
severely limits the expressiveness of the security
model. The set of messages an object is able to
respond  to  (i.e.  the  object’s  interface)  exactly  defines 
the  ways  an  object  might  be  accessed;  a  set  of 
elementary  actions  can  never  be  complete  for  the 
great  variety  of  application  domains. 

The remainder of the paper is structured as
follows: section 2 summarizes basic terminology and
provides an overview of the database federation
IRO-DB II. Section 3 introduces the concepts used by
object-oriented access controls. It concentrates on the
object activity stack, proposes an object authorization
language, and specifies policies concerning
authorization and access control within OOAC.
Finally, section 4 concludes and provides directions
for future research efforts.

1.1  Related  Work 

In [3], the authors present an authorization model for
next-generation database systems supporting
object-oriented concepts as well as semantic data modeling
concepts. Special effort is devoted to computing
implicit authorizations from a set of explicitly defined
authorizations. [5] first suggests specifying privileges
for users to execute methods on objects. The
authorization model enforces the concept of private
and protected methods. [14] presents a method-based
authorization model as well as algorithms that
evaluate the proposed authorization policies
concerning generalization, aggregation, relationships,
abstract classes and indirect method calls. [6]
summarizes the issues arising within discretionary
authorizations for object bases. The authors provide
suggestions on how to successively extend a basic
authorization model with object-oriented features
concerning access types, security subjects and objects,
access controls, and authorization. [16] concentrates
on the evaluation algorithms deciding whether an
access should be fully or partially granted/denied. The
algorithms are discussed for compile-time as well as
run-time evaluation. [15] develops an authorization
model for object-oriented databases. The model
contains user access controls and administration of
authorizations. It consists of a set of policies, a
structure for authorization rules and algorithms to
evaluate access requests against the authorization
rules. [17] discusses means for object-oriented
environments to reduce the number of explicit
authorizations to the number of meaningful
authorizations. In this work the concept of compound
subjects is introduced in order to be able to model
certain real world security requirements (e.g. the
four-eyes principle). [7] provides another early but
important contribution to authorizations in
object-oriented environments. [11] discusses role-based
access controls in an interoperable environment,
including object-oriented as well as relational
database systems.

2  Basic  Terminology,  IRO-DB  Overview 

The  basic  characteristics  that  are  common  to  every 
object-oriented  system  can  be  summarized  as  follows 
(compare  [1],  or  [4]): 

•  object:  a  real-life  entity  having  structure,  state  and 
behavior.  Each  object  is  associated  with  an  object 
identifier  (oid)  that  is  unique  and  fixed  for  the 
whole  life  of  the  object. 

•  structure and state: each object is associated with
a set of attributes that specify the structure of an
object. The attribute values, being themselves
objects, determine the state of an object at any
time.

•  behavior: each object is associated with a set of
methods  specifying  an  object’s  behavior.  A 
method  consists  of  a  method  name,  a  signature 
(the  names  and  types  of  parameters  and  the  result 
type)  and  an  implementation  (a  piece  of 
executable  code)  which  can  be  invoked  by  a 
message  that  matches  a  particular  method  name 
and  signature.  Some  systems  allow  a  method  to  be 
overloaded  meaning  that  another  method  of  that 
object  has  the  same  name  but  different  signature. 

•  encapsulation:  the  attributes  of  an  object  can  only 
be  accessed  by  sending  messages  to  the  object 
resulting  in  the  execution  of  a  method 
corresponding  to  the  particular  message.  The  set 
of  messages  an  object  can  respond  to  is  called  the 
object’s  interface.  The  execution  of  a  method  may 
imply  the  object  sending  messages  to  itself  or  to 
some  other  objects. 

•  class:  a  template  for  a  set  of  objects  sharing  the 
same  structure  and  behavior.  Additionally,  classes 
serve  as  an  object  factory  (create  objects  of  a  class 
by  sending  the  new  message)  and  an  object 
warehouse  (maintaining  the  extent,  i.e.  the  set  of 
objects  that  are  instances  of  a  class). 

•  inheritance: classes can have more specialized
sub-classes and more general super-classes with
the purpose that all sub-classes inherit the
super-class's structure and behavior. Sub-classes may
define additional attributes and methods or may
override inherited methods, with the effect that
objects of different levels of an inheritance
hierarchy have methods with the same name and
signature but different behavior.

•  complex objects: objects that are built from
simpler ones by applying complex object
constructors to them, including sets, lists, bags,
arrays, tuples, and the like.

Any  object-oriented  database  system  has  to  provide 
the  features  and  characteristics  listed  above  combined 
with  extensibility  (no  distinction  in  usage  between 
system  defined  and  user  defined  types), 
computational  completeness  (ability  to  express  any 
computational  function),  and  typical  database  features 
like  persistence,  secondary  storage  management, 
recovery mechanisms, and ad hoc query facilities. In
this paper we do not concentrate on the optional and
open  features  of  an  object-oriented  database  system  as 
mentioned  in  [1]  since  these  characteristics  (e.g. 
multiple  inheritance,  versions,  etc.)  still  differ 
significantly  among  current  object  databases. 

2.1  The IRO-DB II Database Federation


Figure 1: The IRO-DB architecture.

IRO-DB (interoperable relational and object-oriented
databases) is a European ESPRIT project
implementing a database federation (compare [20]). It
uses a three-layered architecture (see Figure 1) having
a local, a communication, and an interoperable layer.
A common object-oriented data model (ODMG, see
[2]) is used throughout the layers in order to provide
interoperability. The model defines an object
definition language to declare the interfaces to object
types, an object query language to formulate database
queries, and an object manipulation language to
retrieve and manipulate database objects within a
programming language (e.g. C++). All component
databases at the local layer of IRO-DB implement a
so-called local database adapter (LDA) which makes
the local system appear as if it were an
ODMG-compliant database. As shown in Figure 1, each LDA
exports its local schema or parts of it as an export
schema using remote object access services of the
communication layer. The various export schemata
are imported at the interoperable layer (import
schema) and integrated into an interoperable schema
using derived classes (compare [8]). Classes of this
kind provide federated views on the classes of the
import schema with unique semantics for all object
types and methods.

The  authorization  model  specified  within  IRO-DB 
provides  a  high  level  of  security.  It  combines 
ownership  of  data  with  centralized  administration  of 
security  issues  by  using  role-based  access  controls. 
Furthermore,  powerful  concepts  like  negative  and 
implied  authorization  (see  [12])  are  used  to  provide 
enough  flexibility  for  integrating  heterogeneous 
authorization models. The security architecture of
IRO-DB (see [13]) allows a global security policy to
be specified in order to prevent global security risks
like multi-site aggregation of data, and allows any
number of heterogeneous local security policies to be
integrated for ensuring local autonomy.

IRO-DB  II  will  be  an  extension  of  IRO-DB 
applying  the  developed  database  federation  within  the 
world-wide-web. An ODMG/Java binding is
currently under development which will replace the
ODMG/C++ binding used in IRO-DB.

3  Object-Oriented  Access  Controls 
(OOAC) 

In  this  section  we  will  introduce  the  concepts 
developed  for  OOAC  as  the  basic  access  control 
policy  of  IRO-DB  II. 

3.1  Prerequisites  for  OOAC 

OOAC requires an object database to provide a class
hierarchy having a distinct origin class and a
sub-hierarchy of objects that can be referenced by name.
Since not every object should be subject to access
controls, OOAC assumes that only access to named
objects is controlled. Some object databases
explicitly distinguish between persistent objects
(objects that survive the process within which they
have been created) and transient objects (objects that
live only for the execution time of the process within
which they have been created). Others implement the
concept of persistence by reference (an object is
persistent as long as it is referenced by at least one
other object). In any case, only persistent objects
should be subject to access controls since they are the
asset to protect, while transient objects 'die' at the end
of a process. In this paper we use the term protection
object for objects that can be named and should be
subject to access controls.

The set of classes offered by an object database
can be grouped into several schemata, for instance, a
basic schema, a security sub-schema, an application
domain sub-schema etc. The basic schema as required
for OOAC could look as shown in Figure 2. It
contains a root class (Object) holding the object
identifier and offering methods to create (new),
delete, and copy objects. Furthermore, a root class for
all named objects (NamedObject) is offered,
providing methods to find (lookup) an object and to
retrieve or change its name. Finally, some general
purpose classes could be provided that implement
common and useful functionality (e.g. collections,
strings, etc.).


Figure  2:  A  basic  schema  as  required  by 
OOAC  (OMT  notation,  see  [19]). 

In a true object-oriented environment basically
everything is regarded as an object. Hence, the active
entities referred to as subjects in security literature
(e.g. users, roles, etc.) are objects, too. Figure 3
illustrates a security sub-schema that is suitable for
implementing role-based access controls. Class Subject
(sub-class of NamedObject) provides special behavior
concerning activity (see section 3.3) common to all
security subjects. Two kinds of subjects exist in this
case, namely users and roles. Users have to be
authenticated (using a password check) and can be
members of several roles. A user has to play (activate)
a role in order to receive certain authorizations. Roles
may be structured within a role hierarchy using the
subroles/superrole relationships.

Figure 3: An example for a security sub-schema
for role-based access controls.

An  application  sub-schema  contains  classes  relevant 
to  a  particular  application  domain.  In  this  paper,  we 
use  a  simple  example  used  within  the  IRO-DB 
project.  It  focuses  on  a  production  database  that 
maintains  parts  which  are  distributed  over  two  local 
databases. 


Figure 4: An example for an application sub-schema.

Basically, objects at the interoperable layer of
IRO-DB may be native objects defined within the
home database of the interoperable layer, may be
imported from a local database, or may be derived
from native, imported, or other derived objects.
Import objects additionally maintain a local object
identifier (loid) that is composed of the original object
identifier and some local site information. Derived
objects additionally maintain a global object identifier
(goid) that contains information about the classes
from which the object is derived. The example
schema in Figure 4 shows two import classes, namely
S_PART and I_PRT, whose objects are in fact located
at different local databases. Both classes maintain a
part identifier (S_PART.part_id, I_PRT.prt_id) and a
part description (S_PART.description,
I_PRT.prt_desc) about logically identical parts.
However, class S_PART additionally maintains an
update date (upd_date), and class I_PRT additionally
holds the quantity (emps_qty) of produced parts. The
derived class PART provides a global view on both
import classes, allowing the part description as well as
the quantity of any part stored in the distributed local
databases to be retrieved and changed. The native
class IROApplication provides functionality for
opening and closing IRO-DB (openDB, closeDB), for
transaction management (beginTA, commitTA,
closeTA) and for identification and authentication
(login, logout) of IRO-DB users.

3.2  Messages 

As mentioned in section 2, objects communicate by
sending messages to themselves or to other objects,
which react by executing a particular method that
shows behavior in that it may change the state of the
object, alter parameter values or provide a return
value. Object-oriented access controls simply deal
with controlling the flow of messages among objects
of an object database.


Figure  5:  Controlling  the  flow  of  messages 
among  objects. 

Figure 5 illustrates the message sending and access
control mechanisms. During the execution of method
m1 the source object sends message msg4 to the target
object, which is expected to react by executing its
corresponding method m4. OOAC intercepts the
message sending process¹ and applies access controls
in that it evaluates whether the source object is
allowed to send the particular message to the target
object (see section 3.5 for details about the evaluation
process). If allowed, the message is sent and the target
object executes the requested method. If denied, the
message is blocked and an access control exception is
raised². The requested method is prevented from
executing and its output parameter values as well as
an optional result value are set to a distinct value, e.g.
nil. Each application may catch the access control
exceptions and handle them according to its needs.
Several nested method calls may occur during
program execution, e.g. object1 sends message1 to
object2 which in turn sends message2 to object3 etc.,
which leads to a stack of active objects as explained
in the following sub-section.

¹ Triggers could be an adequate mechanism to
intercept the sending of messages among objects.
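
The interception step described above can be illustrated with a minimal Python sketch. It is not taken from IRO-DB: the names Interceptor, AccessControlError and try_send are hypothetical, authorizations are reduced to explicitly allowed (source, message, target) triples, and Python's None stands in for nil.

```python
class AccessControlError(Exception):
    """Signals a blocked message (hypothetical name)."""


class Part:
    """Toy target object with a one-method interface."""

    def __init__(self, name, description):
        self.name = name
        self._description = description

    def description(self):
        return self._description


class Role:
    """Toy source object; only its name matters here."""

    def __init__(self, name):
        self.name = name


class Interceptor:
    """Checks every message against explicitly allowed triples."""

    def __init__(self):
        self._allowed = set()

    def allow(self, source, message, target):
        self._allowed.add((source.name, message, target.name))

    def send(self, source, message, target, *args):
        # A blocked message never reaches the target's method.
        if (source.name, message, target.name) not in self._allowed:
            raise AccessControlError(
                f"{source.name} may not send {message} to {target.name}")
        return getattr(target, message)(*args)


def try_send(interceptor, source, message, target, *args):
    """Application-level handler that maps a denied message to None."""
    try:
        return interceptor.send(source, message, target, *args)
    except AccessControlError:
        return None
```

In this sketch the application decides how to react to a denial: send raises the exception, while try_send shows one possible handler that yields the distinct value instead.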

3.3  Activity

OOAC requires a system-maintained activity stack
which usually corresponds to the stack of method
calls. Each time an object receives a message and
starts executing the corresponding method, the object
becomes active and is pushed onto the activity stack.
All subsequent messages are regarded as sent from
this object until it returns from executing the method
and is popped from the stack again. The most recently
activated object is the basis for access control
decisions. If a decision is not possible, the previously
activated object is examined, and so on until the
initially activated object is reached, which usually
corresponds to a kind of default system object.
Instances of class Subject or of its sub-classes are the
only objects that can actively modify the activity
stack independently of the sequence of method
invocations, in one of the following ways:

•  activate on-behalf-of: the subject is additionally
pushed onto the activity stack, as is the normal
case for any other object, with the difference
that the subject remains activated until it is
explicitly deactivated (popped from the stack).

•  activate instead-of: the subject replaces the
previously activated object in the activity stack,
with the consequence that the authorizations of the
replaced object no longer influence access
control decisions.
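
The two activation modes can be sketched in Python as follows. This is an illustrative model under simplifying assumptions, not the IRO-DB implementation: the class name ActivityStack and its method names are hypothetical, objects are represented by their names, and only the persistent effect of an activation is modeled.

```python
class ActivityStack:
    """Stack of active objects; the bottom entry is the system object."""

    def __init__(self, system="system"):
        self._entries = [system]

    def push(self, obj):
        # Ordinary activation when an object starts executing a method.
        self._entries.append(obj)

    def pop(self):
        # Ordinary deactivation on method return; the system object stays.
        if len(self._entries) == 1:
            raise IndexError("the system object cannot be popped")
        return self._entries.pop()

    def activate_on_behalf_of(self, subject):
        # The subject is pushed and stays until explicitly deactivated.
        self._entries.append(subject)

    def activate_instead_of(self, subject):
        # The subject replaces the most recently activated object.
        self._entries[-1] = subject

    def lookup_order(self):
        # Most recently activated object first, system object last.
        return list(reversed(self._entries))
```

The list returned by lookup_order is the order in which objects would be examined for an access control decision: the most recently activated object first, then its predecessors down to the system object.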

Table 1 shows the changes to an activity stack during
the execution of a simple example application that
uses the schema illustrated in Figure 4. The IRO-DB
application (IROApplication[1]; the characters '[' and
']' denote instantiation, i.e. the object named "1" of
class IROApplication) offers the following
functionality: (1) identify and authenticate a user, (2)
let the user play a particular role, (3) let the user
choose a particular part, and (4) change the part
description in both of the local databases from which
parts are derived.

After the big-bang (i.e. the process that represents
IROApplication[1] is loaded into memory) a default
system object immediately passes control to
IROApplication[1] by sending the message main.
During the execution of method main, the application
first opens the IRO-DB database "example" and tries
to login a user, i.e. it identifies User[7] and
authenticates him/her using password "abc". The
method authenticate actively changes the activity
stack in that it lets an authenticated user work
on-behalf-of the calling application. Next, the method
login identifies that the user wants to play Role[2] and
sends the corresponding message play to the
particular role. The method play actively changes the
activity stack again in that it lets the requested role
work instead-of the active user. Third, the application
identifies that the user wants to modify the description
of PART[15] to the value "Part15" and sends the
corresponding message description("Part15") to the
particular part object, which in turn sets the
description attributes of both import objects
S_PART[15] and I_PRT[15] from which PART[15]
has been derived. After completing the task the
application performs a logout, closes IRO-DB and
exits; the corresponding process is terminated. The
following section goes into detail on specifying
authorizations within OOAC and illustrates some
means to facilitate this process, for instance, using
template or conditional authorizations.


² We assume the existence of exceptions in the
object database here since they are the most intuitive
way to handle errors.




Source Object       Target Object       Message
(Activity Stack)                        (Method Call Stack)
------------------  ------------------  -------------------------
system              IROApplication[1]   main(1,"1")
IROApplication[1]   IROApplication[1]   openDB("example")
IROApplication[1]   IROApplication[1]   login("7","abc","2")
IROApplication[1]   User[7]             authenticate("abc")
User[7]             User[7]             activate(OnBehalf)
User[7]             Role[2]             play
Role[2]             Role[2]             activate(InsteadOf)
Role[2]             PART[15]            description("Part15")
PART[15]            S_PART[15]          description.set("Part15")
PART[15]            I_PRT[15]           prt_desc.set("Part15")
Role[2]             IROApplication[1]   logout
IROApplication[1]   IROApplication[1]   closeDB
IROApplication[1]   IROApplication[1]   exit
system

Table 1: Changes within the activity stack while executing an
example application.


3.4  An  Object  Authorization  Language 
(OAL) 

Authorization  in  OOAC  specifies  the  messages  a 
source  object  may  send  to  a  target  object.  In  order  to 
describe  authorizations  in  a  declarative  manner  we 
propose  an  object  authorization  language  (OAL).  The 
authorizations  necessary  for  the  simple  example 
application  mentioned  in  the  previous  section  can  be 
expressed  in  OAL  as  follows: 


ALLOW system SENDING main, exit TO IROApplication[1];

ALLOW IROApplication[1] SENDING authenticate TO User[7];

ALLOW User[7] SENDING play TO Role[2];

ALLOW Role[2] SENDING description(String) TO PART[15];

ALLOW PART[15] SENDING set TO
S_PART[15].description,
I_PRT[15].prt_desc;


The  subsequent  sections  describe  the  3  kinds  of 
authorizations  within  OAL,  namely  template, 
conditional,  and  negative  authorizations. 

3.4.1  Template  Authorization 

The OAL contains mechanisms to reduce the
administrative effort of specifying authorizations.
The characters '*' (any) and '$' (arbitrary) can be used
to specify sets of objects and/or messages, or
to specify an arbitrary object or message that can be
referred to from other points within an authorization.
The latter concept is especially useful for conditional
authorizations (see section 3.4.2).

Using the template characters '*' and '$', an object,
either source or target, may be specified in one of the
ways shown in Figure 6:

[Figure 6 grid: Y axis "class", X axis "object"; axis values: particular, arbitrary, any]

Figure 6: Template object specifications
within OAL.

Both the X and Y axes show the possible types of
specifications, which can be one of particular (by
name), arbitrary (by character '$' followed by an
identifier) or any (by character '*'). The X axis
corresponds to object definitions, the Y axis to class
definitions. Thus, a complete object specification may
be a particular object of a particular class
(AClass[inst]), an arbitrary object of a particular
class (AClass[$inst]), any object of a particular class
(AClass[*]), an arbitrary object of an arbitrary class
($AClass[$inst]), any object of an arbitrary class
($AClass[*]), or any object of any class (*).
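
To make the six forms of object specification concrete, the following Python sketch matches a specification against a concrete object given as a class name and an instance name. It is an illustration only: the function match_spec, the regular expression, and the representation of bindings as a dictionary keyed by the '$' identifier are assumptions, not part of the OAL definition.

```python
import re

# One class part (name or $variable) and one instance part
# (name, $variable, or *), e.g. "AClass[$inst]".
_SPEC = re.compile(r"^(\$?\w+)\[(\*|\$?\w+)\]$")


def match_spec(spec, cls, inst, bindings=None):
    """Return the updated bindings if (cls, inst) matches, else None."""
    bindings = dict(bindings or {})
    if spec == "*":
        return bindings  # any object of any class
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError(f"malformed object specification: {spec!r}")
    for pattern, actual in ((m.group(1), cls), (m.group(2), inst)):
        if pattern == "*":
            continue  # "any" matches without binding
        if pattern.startswith("$"):
            # "arbitrary": bind on first use, compare on later uses.
            if bindings.setdefault(pattern, actual) != actual:
                return None
        elif pattern != actual:
            return None  # "particular": names must be equal
    return bindings
```

Because a successful match may return an empty dictionary, callers should test the result with "is None" rather than for truthiness. Passing the bindings of the source specification into the match of the target specification gives the cross-reference behavior of '$' identifiers.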

The OAL also allows template messages to be
specified. Thus, a complete message specification
may be a particular message (e.g.
msg1(AClass[*],*); parameters may optionally be
specified as a set of object specifications placed
within parentheses and separated by commas), an
arbitrary message ($aMessage) of the object's
interface, or any message (*) of the object's
interface.

The example authorizations described in the
previous sub-section could be made more general by
using template authorizations like the following:

(1)  ALLOW system SENDING main, exit TO Application[*];

(2)  ALLOW Application[*] SENDING authenticate TO User[*];

(3)  ALLOW User[$u] SENDING play TO User[$u].roles[*];

(4)  ALLOW Role[2] SENDING description(String) TO PART[*];

(5)  ALLOW PART[$p] SENDING * TO PART[$p].origin[*];

The set of authorizations specifies that the default
system object may send messages main and exit to any
application (1) which in turn may authenticate any of
the known users (2). Each particular user may play
those roles (s)he is a member of (3). Role[2] may
change the description of any PART object (4), and a
particular PART object may send any message to
those objects the PART object has been derived from,
i.e. the origin of the derived object (5).

3.4.2  Conditional  Authorization 

In  many  cases  it  is  desirable  to  specify  constraints  on 
authorizations,  for  instance,  dependencies  on  attribute 
values,  time,  location,  or  even  on  other  authorizations 
(like  within  the  concept  of  implied  authorization,  see 
for instance [3]). OOAC allow two kinds of
conditional authorizations to be specified:

•  dependent  on  object  state,  and 

•  dependent  on  other  authorizations. 

Suppose  the  following: 


In this case, the user objects are designed to have an
attribute holding an expiration date. The message
expirationDate returns this date, which is then
compared to the current date (message now is a class
member of Date returning the current date). The
authorization specifies that any user of the database is
allowed to play Role[Subscriber] as long as the user's
expiration date has not passed.
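
The state-dependent condition described above can be sketched in Python. The sketch only illustrates the comparison; it does not reproduce OAL syntax. The class User and the function may_play_subscriber are hypothetical, the current date is passed in explicitly in place of Date.now so that the check is reproducible, and treating the expiration day itself as still valid is an assumption.

```python
from datetime import date


class User:
    """Toy user object carrying an expiration date attribute."""

    def __init__(self, name, expiration_date):
        self.name = name
        self._expiration_date = expiration_date

    def expirationDate(self):
        # Message named as in the text; returns the stored date.
        return self._expiration_date


def may_play_subscriber(user, today):
    """True while the user's expiration date has not passed.

    `today` stands in for Date.now. Counting the expiration day
    itself as valid is an assumption of this sketch.
    """
    return user.expirationDate() >= today
```

An access control monitor would evaluate such a condition at the moment the play message is intercepted, so that the same authorization yields different decisions as the object state or the date changes.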

The authorization below is dependent on another
authorization and realizes a mutual exclusion
constraint in that Role[2] is denied whatever Role[1]
is authorized for.

IF ALLOWED Role[1] SENDING $m TO $AClass[$inst] THEN

DENY Role[2] SENDING $m TO $AClass[$inst];

END

Summarizing, the OAL provides possibilities to
specify authorizations that depend on object state
and/or other authorizations. The former allows any
kind of value dependency to be realized that is
possible using the database objects and messages
(including time, location, etc.). The latter can be used
to realize particular policies, e.g. for the concept of
implied authorization or for mutually exclusive roles.
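
An authorization that depends on other authorizations, such as the mutual exclusion of two roles, can be sketched in Python by deriving prohibitions from existing permissions. The function names and the representation of authorizations as (source, message, target) triples are hypothetical, and giving a derived prohibition precedence over an explicit permission is an assumption of this sketch.

```python
def apply_mutual_exclusion(allowed, privileged, excluded):
    """Derive denials for `excluded` from what `privileged` may send."""
    return {(excluded, message, target)
            for source, message, target in allowed
            if source == privileged}


def is_allowed(allowed, denied, source, message, target):
    """Permit only triples that are allowed and not denied (assumption)."""
    triple = (source, message, target)
    return triple in allowed and triple not in denied
```

With Role[1] as the privileged role and Role[2] as the excluded role, every message Role[1] may send to some target becomes a prohibition for Role[2] on the same message and target, while other permissions of Role[2] stay untouched.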

3.4.3  Negative  Authorization 

Usually, authorization models allow positive
authorizations (permissions) to be specified, based on
a closed world assumption saying that any access is
denied unless explicitly permitted. Authorization
models have been enriched with the concept of
negative authorizations (prohibitions) for more
flexibility, combined with an open world assumption
(any access is allowed unless explicitly prohibited). A
mixture of positive and negative authorizations
is feasible, making it possible to specify exceptions
from general specifications.

Within OOAC, authorization may be based either
on an open or on a closed world assumption.
Furthermore, an access may be either allowed or
denied, and a previously allowed or denied
access may be revoked. The sequence of
authorizations is relevant with respect to conflicts
either due to template authorizations or due to
coexistence of permissions and prohibitions. Consider
the following sequence of authorizations:


(1)  ALLOW User[*] SENDING description() TO PART[*];

(2)  DENY User[47] SENDING description() TO PART[*];

(3)  DENY User[*] SENDING description(String) TO PART[*];

(4)  ALLOW User[11] SENDING description(String) TO PART[*];


The sequence specifies that, in principle, any user is
allowed to retrieve the description of PART objects
(1), except for User[47] (2). On the other hand, any
user is in principle denied permission to change the
description of PART objects (3), with the exception of
User[11] (4).
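The role of ordering in the sequence above can be illustrated with a small sketch. It is not part of the OAL and rests on two assumptions that the text does not fix: a later matching authorization overrides an earlier one, and a closed world default applies when nothing matches. The class `Authorization` and the function `is_allowed` are hypothetical names, and the target clause is omitted because all four authorizations address PART[*].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Authorization:
    allow: bool           # True for ALLOW, False for DENY
    user: Optional[int]   # None stands for the wildcard User[*]
    message: str          # message signature, e.g. "description()"

    def matches(self, user: int, message: str) -> bool:
        # A rule matches if it names this user (or any user) and this message.
        return self.user in (None, user) and self.message == message


# Authorizations (1) to (4) from the text, in their original order.
RULES = [
    Authorization(True, None, "description()"),
    Authorization(False, 47, "description()"),
    Authorization(False, None, "description(String)"),
    Authorization(True, 11, "description(String)"),
]


def is_allowed(user: int, message: str, rules=RULES, default: bool = False) -> bool:
    """Assumed semantics: the last matching authorization wins."""
    decision = default  # closed world assumption when no rule matches
    for rule in rules:
        if rule.matches(user, message):
            decision = rule.allow
    return decision
```

Under these assumptions, User[47] is refused description() because authorization (2) follows (1), while User[11] may send description(String) because (4) follows (3). Reordering the list would change both outcomes, which is why the sequence matters.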

3.5  Authorization  and  Access  Control 

A set of policies characterizes the basic attitudes of
OOAC. We try to minimize the number of policies in
order to emphasize the flexibility offered by template
and conditional authorizations. Nevertheless, some
fundamental policies that are useful and are intended
to reflect the principal nature of object orientation
are specified in this sub-section.

3.5.1  Fundamental  Policies 

The first policy has already been mentioned. It
reduces the number of access controls within OOAC by
providing a base class NamedObject, which has to be
sub-classed for objects that can be named and should
be subject to access controls (i.e. protection
objects).


(protection-object):  access  controls  are 
applied  to  messages  sent  to  named  objects. 

If a non-protection object initiates a message, that
object has to work on behalf of a protection object in
order to possibly be authorized for the particular
message. Any message that is sent to a non-protection
object can only be controlled at the programming
language level.

The second fundamental policy serves the principle of
encapsulation in that it lets any object freely use
its own interface.


(object-interface): any object is allowed to
send any possible message to itself.


This policy can also be disabled, if desired, by using
the following template authorization: "DENY
$AClass[$inst] SENDING *". The negative authorization
does not specify a target object, which is then
assumed to be equal to the source object. It says that
an arbitrary object of an arbitrary class must not
send messages (defined for that class or inherited, see
the inheritance 1 and 2 policies below) to itself.

3.5.2  Object/Class  Methods 

A method may be defined either for individual objects
or for all objects of a class at once. Object methods
are the regular case and can be executed independently
on any of the class's objects. Authorizations for
object methods may include an instantiation clause
("[...]") and thus be specified for individual objects.
A class method can only be executed on all objects of a
class at once, not independently on a particular object
of a class. An example of a class method taken from the
basic schema illustrated in Figure 2 is
NamedObject.lookup, which takes a name and returns the
corresponding object from within the extent of the
regarded class, e.g. User.lookup("user1") returns the
instance of class User that is named "user1" (the
expression is equal to User[user1] within the OAL).
Consequently, an authorization for a class method does
not allow an instantiation clause to be specified for
the target object within the OAL. However, not all
object systems distinguish between object and class
methods.

3.5.3  Overloading 

Overloaded methods have the same name but different
signatures within one class. Examples of overloaded
methods are name() and name(String) of class
NamedObject, again taken from the basic schema. The
former returns the name of an object; the latter
changes the name of an object. Overloaded methods are
distinguished by their signature, i.e. the number and
types of parameters as well as the type of an
optionally returned result. Note that most object
systems ignore the result type for technical reasons.

3.5.4  Inheritance 

Methods  may  be  inherited,  that  is,  specified  and 
implemented  in  a  direct  or  indirect  super-class  of  the 
regarded  class.  With  respect  to  access  controls,  all 
methods  that  are  inherited  by  an  object  are  regarded 
as  components  of  the  object’s  interface  which  leads  to 
the  following  two  policies: 


(inheritance  1):  authorizations  may  be 
specified  for  messages  that  are  inherited  from 
super-classes  of  the  target  object’s  class. 

Thus, an authorization like "ALLOW User[1]
SENDING name() TO DerivedObject[*]" could be
specified to allow User[1] to retrieve the name of
derived objects, for instance, although the particular
method is defined and implemented in class
NamedObject.


(inheritance 2): an authorization to send any
message defined for object o includes those
messages that are inherited from any of the
super-classes of o.


For instance, the authorization "ALLOW User[1]
SENDING * TO PART[1]" allows User[1] to send
any message defined for class PART (i.e. description
and quantity) as well as any message inherited from
the super-classes (i.e. name, lookup, new, copy, and
delete) to PART[1].

We do not go into detail on multiple inheritance,
since it is an optional feature of object databases
(compare [1]). Nevertheless, if an object system
supports multiple inheritance, OOAC has to deal with
the possibility of ambiguous messages (i.e. messages
for which it is not clear which method of the
super-classes has to be executed). Furthermore, an
object may combine methods inherited from both
protection objects and non-protection objects, which
requires additionally controlling some messages sent
to non-protection objects.

3.5.5  Polymorphism 

Inherited  methods  may  be  overridden,  i.e.  the 
implementation  of  the  method  may  be  changed  or 
extended  without  changing  the  name  and  signature  of 
the  method.  The  method  that  should  be  executed  can 
be  determined  by  examining  the  type  of  the  regarded 
object.  If  the  decision  depends  on  the  dynamic  type  of 
an  object  the  concept  is  also  called  late  binding,  since 
the  dynamic  type  can  only  be  determined  at  run-time. 
The  following  two  policies  dealing  with  the 
polymorphism  property  of  objects  can  thus  be  stated: 


(polymorphism 1): an authorization to send
message m to object o additionally holds for
any possible form (polymorphism) of o that
can respond to m.


For instance, the authorization "ALLOW User[1]
SENDING name() TO PART[1]" additionally allows
User[1] to send name() to PART[1] in the form of a
DerivedObject or a NamedObject instance, since these
super-classes of PART can respond to message
name().


(polymorphism 2): an authorization to send
message m to any object that is an instance of
a particular class C includes those objects that
are instances of sub-classes of C, which thus
could be instances of C as well (due to
polymorphism).

The authorization "ALLOW User[1] SENDING
name() TO DerivedObject[*]" allows User[1] to send
message name() to any object that is an instance of
class DerivedObject (which in this case will be an
empty set, since DerivedObject is an abstract class).
Additionally, the message name() may be sent to those
objects that could be instances of class
DerivedObject, i.e. that are instances of one of the
sub-classes of DerivedObject, namely PART.
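The combined effect of the inheritance and polymorphism policies can be sketched as follows. This is only an illustration under stated assumptions: the three Python classes stand in for the schema classes of the same name, `AUTHORIZATIONS` and `may_send` are hypothetical, only positive authorizations under a closed world default are modelled, and the fact that DerivedObject is abstract is not enforced.

```python
class NamedObject:
    def __init__(self, name):
        self._name = name

    def name(self):
        # Defined once here and inherited by all sub-classes.
        return self._name


class DerivedObject(NamedObject):
    pass


class PART(DerivedObject):
    pass


# ALLOW User[1] SENDING name() TO DerivedObject[*]
AUTHORIZATIONS = [("User[1]", "name", DerivedObject)]


def may_send(subject, message, target):
    """Return True if some authorization covers sending `message` to `target`."""
    for auth_subject, auth_message, auth_class in AUTHORIZATIONS:
        if (
            auth_subject == subject
            and auth_message == message
            # polymorphism 2: instances of sub-classes of the class count as well
            and isinstance(target, auth_class)
            # inheritance 1: the message may be inherited from a super-class
            and callable(getattr(target, message, None))
        ):
            return True
    return False  # closed world default (an assumption of this sketch)
```

In this sketch a PART instance is covered by the authorization on DerivedObject[*], whereas a plain NamedObject instance is not, since NamedObject is a super-class rather than a sub-class of DerivedObject.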

3.5.6  Complex  Objects 

As mentioned above, complex objects are objects that
are composed of simpler ones. In this paper, we assume
an object system that completely encapsulates the
attributes of an object, that is, the attributes can
only be accessed by other objects by sending messages
that can be controlled. Only the object itself is
allowed to manipulate its attributes directly. Each
attribute value is in fact a reference to an object.
Some objects are collections and can be used to
aggregate other objects (e.g. Collection, shown in
Figure 2). The concept is orthogonal in the sense that
a collection may hold any object and thus may hold
other collections, too.

In [12], we proposed some policies concerning implied
authorization for components of complex objects,
saying that if a subject (e.g. a user) is authorized to
access a complex object, the subject should be
implicitly authorized to access each component of the
object and thus the complex object as a whole. OOAC do
not use any automatic authorization mechanisms.
Instead, each object has to be authorized for the
actions it wants to execute, which applies equally to
subjects since they are themselves objects in OOAC.
For retrieving the name of a derived PART object, for
instance, the PART object has to be authorized to
retrieve the name of any (import) object the PART has
been derived from. An adequate authorization in OAL
could be formulated as "ALLOW PART[$p] SENDING
name() TO PART[$p].origin[*]", authorizing an
arbitrary PART object to retrieve the name of its
origin objects.

Another  problem  is  that  some  object  systems 
provide  types  that  are  system  defined  (e.g.  int,  float, 
etc.)  and  do  not  represent  first-class  objects. 
Furthermore,  attributes  might  be  declared  public,  that 
is,  they  may  be  directly  accessed  by  other  objects 
without  using  the  object’s  interface.  Assume  the 
attribute  description  of  class  S^PART  to  be  a  public 
String. Since strings are non-protection objects (see
Figure 2), the policy concerning protection objects
has to be extended as follows:


(protection-object'): access controls are
applied to messages sent to protection objects
and to those non-protection objects that are
declared as public parts of a protection
object.

The following authorization could be specified, which
allows User[1] to assign descriptions to any object of
the origin of PART[1]:


ALLOW User[1] SENDING set TO PART[1].origin[*].description;


System-defined types that are declared as public
attributes are handled by OOAC as quasi objects having
the minimum interface get (the attribute value) and
set (the attribute value), which makes it possible to
specify authorizations for these two messages.

4  Conclusions  and  Future  Work 

In  this  paper  we  introduced  a  new  concept  for  access 
controls  that  is  especially  tailored  for  true  object- 
oriented  environments.  The  concept  called  object- 
oriented  access  controls  (OOAC)  is  based  upon  the 
following  assumptions:  (1)  everything  within  the 
object-oriented  environment  is  regarded  as  an  object, 
(2)  thus,  security  subjects  (e.g.  users,  roles,  etc.)  are 
regarded  as  first-class  objects,  too,  (3)  messages  are 
the  only  means  for  communicating  to  other  objects. 

In consequence, OOAC deal with controlling the flow
of messages among the objects of an object database.
An object authorization language (OAL) has been
proposed that makes it possible to specify, in a
declarative manner, the set of messages an object is
allowed or denied to send to other objects. The OAL
provides means to specify template authorizations
(relevant to a set of objects and/or messages),
conditional authorizations (depending on object state
or other authorizations) and negative authorizations
(denying access rather than allowing it). Furthermore,
we presented a minimal set of policies that
corresponds to those properties commonly accepted to
be inherent to object-oriented systems. The policies
address some characteristics concerning the object
interface, inheritance, polymorphism, and complex
objects. On the other hand, OOAC do not impose a
particular kind of access control policy (e.g.
discretionary, role-based, or mandatory access
controls). Instead, any known policy, or even any
policy developed in future, may be implemented using
OOAC, since the structure and behavior of database
objects exactly determine the ways an object may be
accessed as well as the ways an object can be
protected against unauthorized access. The feasibility
of OOAC has been demonstrated by applying the
concept to IRO-DB II, an extension of the database
federation IRO-DB, which provides interoperable
access between relational and object-oriented
database systems within the world-wide-web.

Future research efforts will concentrate on
implementing ownership-based (e.g. DAC), role-based
(RBAC) or mandatory (MAC) access controls within
OOAC. Furthermore, administration issues will be
addressed assuming the existence of meta classes
(like Class, Attribute, Message, Method, Schema,
Authorization, etc.) as parts of the application
schema, so that OOAC can be applied to controlling
schema modifications and/or security administration in
the same way as to controlling simple object access.
6  Acronyms

DAC      Discretionary Access Controls
IRO-DB   Interoperable Relational and Object-Oriented Databases
LDA      Local Database Adapter
MAC      Mandatory Access Controls
OAL      Object Authorization Language
ODMG     Object Database Management Group
OOAC     Object-Oriented Access Controls
RBAC     Role-Based Access Controls
Abstract

This paper describes the administration policies
supported by the MultiPolicy Authorization System
(MPAS). Several administration policies are supported,
including centralized administration, decentralized
administration with delegation and transfer, and joint
administration. In the paper we first present an
overview of the MPAS architecture. We then discuss
the various administration policies and formalize some
aspects of the proposed administration model.

1  Introduction 

The introduction of an access control system within
any organization entails two main tasks. The first task
is the identification and specification of suitable
access control policies. An access control policy
establishes for each user (or group of users, or
functional role within the organization) the actions
the user can perform on which object (or set of
objects) within the system under which circumstances.
An example of an access control policy [2, 3] is that
“all programmers can modify the project files every
working day except Friday afternoon”. The second task
is developing a suitable access control mechanism
implementing the stated policies. Because of the
richness of possible access control policies, this
second task is quite difficult. Current access control
mechanisms are tailored to a few specific policies and
are unable to satisfactorily support the access
control requirements of several applications. In most
cases, either the organization is forced to adopt the
specific policy built into the access control
mechanism at hand, or access control policies must be
implemented as application programs. Both situations
are clearly unacceptable. Many advanced applications,
such as workflow applications and computer-supported
cooperative work, have articulate and rich access
control requirements. Therefore, those applications
cannot be adequately supported by a single-policy
access control mechanism. Implementing access control
policies as application programs, on the other hand,
makes it very difficult to verify and modify the
access control policies and to provide any assurance
that these policies are actually enforced.

A possible approach is to develop flexible access
control mechanisms, able to support different access
control policies for possibly different objects within
the system. We refer to such a system as a multipolicy
access control mechanism. Consider the classical closed
and open policies. Under the former, a subject can
access an object only if an explicit positive
authorization is specified. Under the latter, a subject
can access an object only if there is no explicit
denial (also called negative authorization). It is easy
to encounter many situations where some objects within
a system must be governed by the open policy (such as
data objects containing information available to the
majority of users), whereas other objects within the
same system need a stricter control (such as data
objects containing information available to a few
selected users), thus requiring a closed policy.
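The difference between the two classical policies can be written down as a short decision function. This is a sketch of the definitions just given, not of the MPAS implementation; the function `can_access` and its parameters are hypothetical names.

```python
def can_access(policy: str, has_positive: bool, has_negative: bool) -> bool:
    """Decide access for one subject/object pair under a classical policy.

    has_positive: an explicit positive authorization exists
    has_negative: an explicit denial (negative authorization) exists
    """
    if policy == "closed":
        # Closed policy: access requires an explicit positive authorization.
        return has_positive
    if policy == "open":
        # Open policy: access is granted unless explicitly denied.
        return not has_negative
    raise ValueError(f"unknown policy: {policy!r}")
```

With no authorizations at all, the open policy grants access and the closed policy refuses it, which is why widely shared data suits the former and restricted data the latter.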

Several proposals in the areas of database systems
and operating systems have addressed issues related
to multipolicy access control mechanisms. Proposals
in the database area include the flexible
authorization model proposed in [5], and the Chassis
system [11], specifically addressing access control for
federated heterogeneous database systems. Proposals in
the operating systems area include Trusted Mach [7]
and DTOS [9].

The flexible authorization model presented in [5],
developed as an extension of the Orion authorization
system [12] and recently formalized in a logical
framework [6], supports both positive and negative
authorizations. Moreover, it supports exceptions and
different conflict resolution policies. The Chassis
system provides different local security monitors, each
implementing the specific policy of a site in the
federation. The local monitors are complemented by a
global authorization layer, supporting global
authorization policies. The local authorization
policies can be quite different. A relevant goal of the
Chassis project is how to handle authorization
conflicts among different sites and how to map global
authorizations onto local ones.

None of the above proposals, however, provides
sophisticated authorization administration policies.
Authorization administration refers to the function of
granting and revoking authorizations. It is the
function by which authorizations are entered (removed)
into (from) the access control mechanism. In most
of the above approaches, the administration policies
are based either on the centralized approach or on
the ownership approach, possibly complemented with
the administration delegation approach. Moreover,
multiple administrative policies within the same
system are not supported.

Two different administration policies are briefly
discussed by Fernandez, Gudes and Song [8] in the
framework of an access control mechanism for
object-oriented database systems. The first policy is
based on the ownership approach. The users own the data
and administer their data. Under the second policy,
some special users (administrators) control the access
to data. However, their access control model supports
only the second policy. By contrast, our model
supports both those policies, and many others, so that
users may choose the policy which best suits their
application requirements.

Finally, we would like to mention the brief
preliminary discussion reported in [15] addressing the
use of delegation and joint actions in authorization
systems. These mechanisms are able to support
sophisticated authorization schemes, particularly
useful when dealing with complex information systems
and distributed systems. Even though the authors do not
address authorization administration, we believe that
many issues pointed out in the discussion are relevant
to our approach.

In this paper we present the authorization
administration facilities provided as part of the
MultiPolicy Authorization System (MPAS) currently being
developed at the University of Milano. MPAS supports
both the specification and implementation of multiple
authorization and administration policies while, at the
same time, clearly separating specification from
implementation. The system supports a large variety of
administration policies, from centralized
administration, either DBA-based¹ or owner-based, to
joint-based administration, by which several users are
jointly responsible for authorization administration.
MPAS currently supports only discretionary access
control policies. The reason is that the applications
using discretionary policies are those that usually
require high flexibility. We plan, however, to explore
other types of access control policies, such as the
Chinese wall policy.

¹DBA stands for database administrator.

The remainder of this paper is organized as follows.
Section 2 briefly discusses the MPAS architecture.
Section 3 presents the various administrative
policies supported by MPAS. Section 4 presents a
formalization of some aspects of our authorization
administration model. Finally, Section 5 concludes the
paper and outlines future work.

2  Architecture  of  MPAS 

The architecture of MPAS is illustrated in Figure 1.
The system consists of two main environments: the
policy specification environment and the run-time
environment. In discussing the architecture, we will
cast the discussion in terms of database systems, by
assuming that the items to be protected are data
objects, such as relations in a relational DBMS, or
objects in an object DBMS. However, we believe that the
discussion is also valid in other contexts.

2.1  Policy  specification  environment 

The policy specification environment supports all
functions concerning policy specification for object
authorization and authorization administration.

Example 1  Consider a table Public-info containing
information available to all employees of a given
company. An example of authorization policy is that
access to table Public-info must be governed by the
open policy [5], whereas an example of administration
policy is that authorizations on Public-info can only
be granted by a DBA (in practice, this means that only
the DBA can issue access denials).

A number of predefined authorization policies are
supported (denoted in the reference architecture as the
authorization policies library), including the
following: the traditional closed and open policies;
the closed policy with negation and the conflict
resolution principle based on “denials take
precedence”; the closed policy with negation and the
conflict resolution principle based on “the most
specific authorization takes precedence” [5]. In
particular, the last two policies have conflict
resolution principles for dealing with conflicting
authorizations² granted on the same object to the
same subject. It is also possible for the policy
officer to specify custom-made policies by using a
special-purpose language based on rules [4].

A number of predefined administration policies are
supported (denoted in the reference architecture as the
administration policies library) that will be discussed
in the next section. All the specifications are checked
for consistency and correctness by the analyzer [4].
The result of the analysis is a policy base encoding
the administration and authorization policies for each
data object.

²Two authorizations conflict if one is a positive
authorization whereas the other is a negative one.

Figure 1: Architecture of the multipolicy authorization system (MPAS)

An important aspect of our system is that parametric
policies can also be specified. A parametric policy
is parametric with respect to the objects to which it
applies. It makes it possible to specify policies such
as “All tables created by the role Secretary must be
authorized under the open policy” or “All documents
pertaining to project MPAS must be authorized under the
closed policy”. Parametric policies basically contain
conditions stated against authorization attributes of
data objects. Authorization attributes contain
information about the administered objects that is
relevant from a security point of view. They compose
the security profile of the data object. Each subject
also has a similar security profile. Parametric
policies are very important for systems with large
numbers of subjects and objects to be protected. In
such systems, specifying policies for each object would
not be viable.

2.2  Run-time  environment 

The run-time environment consists of a flexible
authorization mechanism, which is the core of MPAS, and
of a front-end. The flexible authorization mechanism
implements a generalized authorization model able to
support a large number of policies [5]. The front-end
has the task of verifying the users' and DBAs' requests
with respect to the policies stated by the policy
officer (stored in the policy base) and mapping them
onto the generalized authorization model.

Example 2  Consider table Public-info from Example 1,
for which an open authorization policy and a DBA
administration policy have been specified. Suppose
that a DBA enters a positive authorization for a
certain user on this table. Such authorization is
rejected by the front-end because only negative
authorizations can be specified for this table,
according to the open policy.
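The check performed by the front-end in Example 2 can be sketched as below. The sketch is not the MPAS front-end: `ObjectPolicy`, `POLICY_BASE` and `check_grant` are hypothetical names, the policy base holds a single hand-written entry, and the rule that a traditional closed policy accepts only positive authorizations is an assumption mirroring the open case.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectPolicy:
    authorization_policy: str    # "open" or "closed"
    administration_policy: str   # only "dba" is modelled in this sketch


# Policy base entry for the table of Examples 1 and 2.
POLICY_BASE = {"Public-info": ObjectPolicy("open", "dba")}


def check_grant(requester_is_dba: bool, obj: str, positive: bool) -> Tuple[bool, str]:
    """Validate a grant request against the policies stored for `obj`."""
    policy = POLICY_BASE[obj]
    if policy.administration_policy == "dba" and not requester_is_dba:
        return False, "only a DBA may administer this object"
    if policy.authorization_policy == "open" and positive:
        return False, "open policy accepts only negative authorizations"
    if policy.authorization_policy == "closed" and not positive:
        # Assumption: the traditional closed policy has no negation.
        return False, "closed policy accepts only positive authorizations"
    return True, "accepted"
```

Calling `check_grant(True, "Public-info", True)` reproduces the rejection described in Example 2, while the same DBA issuing a denial on the table is accepted.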

3  Administration  Policies 

In this section we discuss the predefined
administration policies supported by our system. Note,
however, that additional policies can be specified by
using the rule-based language supported by the policy
specification environment.

Before discussing the policies it is important to
recall that authorization administration consists of
issuing grant and revoke requests to the authorization
system. Those requests, which must be consistent with
the administration policies, enter authorizations into
or remove them from the authorization base (cf.
Figure 1). They are issued by users that must be, in
turn, properly authorized. We will also refer to the
notion of object creator: it is the user who has
created the object or on behalf of whom the object has
been created (for example, within an application
program run by the user).³

The  following  administration  policies  are  supported 
by  MPAS: 

•  DBA administration: under this policy, only
the DBA can issue grant and revoke requests on
a given object.⁴ This policy is highly centralized
(even though different DBAs can manage different
parts of the database) and is seldom used in
current DBMSs, except in the simplest systems.

•  Object “curator” administration: under this
policy, a user, not necessarily the creator of the
object, is named administrator of the object.
Under such a policy, even the object creator must be
explicitly authorized to access the object.

•  Object owner administration: under this
policy, which is commonly adopted by DBMSs and
operating systems, the creator of the object is the
owner of the object and is the only one authorized
to administer the object.

The second and third administration policies above
can be further combined with administration delegation
and administration transfer. Even though delegation
and transfer could be applied to DBA administration,
we have not included this possibility because
it is not very significant.⁵

By administration delegation we mean that the
administrator of an object (either the owner or the
curator) can delegate administration functions on the
object to other users. Delegation can be specified for
selected access modes, for example only for read
operations. In most cases, delegation of administration
to another user also implies granting this user the
privilege of accessing the object according to the same
access mode specified in the delegation. Most current
DBMSs support the administration policy based on owner
administration with delegation. Note that under the
delegation approach, the initial administrator of the
object does not lose his/her privilege to administer
the object. Therefore, authorizations on the same
object can be granted by different administrators. With
respect to revoke operations, we take the approach,
common to most systems, that only the grantor of an
authorization can revoke it.

³In some systems, a DBA can create an object on behalf
of some users.

⁴Note that DBAs also have the authorization to give
users the privilege to connect to the DBMS, and can
also read and write all data objects created by other
users.

⁵It can be simply implemented in terms of the other two
policies.

Figure 2: Taxonomy of the administration policies supported by MPAS

Legend:
bold lines denote mutually exclusive administration options for the same data object
non-bold lines denote non-mutually exclusive administration options for the same data object

Administration transfer, like delegation, has the
effect of giving another user the right to administer
a given object. However, the original administrator
loses his administration authorization, whereas under
delegation the original administrator keeps his
administration right. When transfer is used with owner
administration it has the semantics of owner transfer;
we call it ownership transfer. In this case, the owner
of the object is actually replaced by the user to whom
the administration has been transferred. Transfer is
very relevant in workflow applications, where objects
(such as documents and office forms) migrate among
different departments (or organizational units) within
the same organization. Often, because of those
transfers, objects may enter different administration
domains and it is, thus, important that the privileges
of administering the objects be properly modified.
Also, security requirements may dictate owner transfer.
Consider a document which is initialized by a secretary
and later on transferred to her boss, who enters
reserved information. In such a case, it is important
that the initial owner, i.e., the secretary, is no
longer authorized to administer the object.

When dealing with transfer, an important question concerns the
authorizations granted by the former administrator (this problem does
not arise in delegation because the former administrator retains his
right to administer the object). The following two approaches are
supported by MPAS for dealing with authorizations granted by the
former administrator:

1. Recursive revoke: all authorizations granted by the former
administrator are recursively revoked.

2. Grantor transfer: all authorizations granted by the former
administrator are kept; however, the new administrator replaces the
old one as grantor of the authorizations (and is thus able to revoke
them). Note that the grantor transfer is not recursive. Therefore, if
the old administrator has delegated administration to other users,
those grants are left in place; only the new administrator becomes
their grantor.

We provide a further transfer option. Transfer can be with acceptance
or without acceptance. By acceptance we mean that the user to whom
the administration (or ownership) is transferred must explicitly
accept the administration responsibility. Transfer without acceptance
means that such explicit acceptance is not required. Explicit
acceptance is especially important when the object ownership is
transferred. Since the object owner is ultimately responsible for the
object and may be liable for the content of the object, explicit
acceptance prevents a user from being given the ownership of an
object without even knowing about it.

Figure 3: (a) An example of administration graph; (b) the graph after
Bob revokes the delegation to Tom; (c) the graph after Bob transfers
the ownership to John with grantor transfer; (d) the graph after Bob
transfers the ownership to John with recursive revoke

Our model also supports joint administration of data objects. Joint
administration means that several users are jointly responsible for
administering the object. Joint administration can be used in both
the object “curator” administration policy and the object owner
administration policy. Joint administration is particularly useful in
computer-supported cooperative work (CSCW) applications where users
typically cooperate to produce a complex data object (such as a
document, a book, a piece of software, a VLSI circuit). In such
applications, each user in the work group is responsible for
producing a component of the complex object; therefore, no single
user is the owner of the entire complex object. Authorizing a user to
access a data object administered under the joint administration
policy requires that all the administrators of the object issue a
grant request. As a further option, MPAS supports joint
administration with quorum. Under this option, an authorization is
granted to a user if a number of administrators equal to the quorum
have issued the proper grant request.

It is important to note that joint administration can be combined
with the various options illustrated before. For example, one of the
administrators of a data object may transfer or delegate its
administration right to another user. A large spectrum of
administration policies is thus obtained.

Figure 2 summarizes the administration policies supported by MPAS. In
what follows we give some examples of the various administration
policies supported by our model.

In the examples, we represent a sequence of delegation operations as
a graph, called an administration graph. In such a graph, a boldface
node denotes the owner of an object, whereas a non-boldface node
denotes a user who has been delegated the administration right on the
object. There is an arc from node i to node j if the user represented
by node i has delegated the administration right to the user
represented by node j; each arc is furthermore labeled with the
delegation time. An example of an administration graph is illustrated
in Figure 3(a).

Moreover, sequences of grant requests for a given access mode m on a
given object o are represented by means of authorization graphs.
Nodes represent users. There is an arc from node i to node j if the
user represented by node i has granted an authorization for m on o to
the user represented by node j. The arc is labeled with the time of
the grant and with the granted access mode. Joint authorizations are
represented by a special place-holder denoted by a vertical boldface
bar. An example of an authorization graph is shown in Figure 4(a).

The following example illustrates some of the administration options
described above for the case of single (i.e., non-joint)
administration.

Example 3 Consider a table T which is administered under the owner
administration policy with both delegation and transfer options.
Suppose that Bob is the owner of the table. Under the above options,
Bob can delegate to other users the right to administer the object.
Suppose that he grants such right to Tom at time 100® and to Mary at
time 120. Suppose, moreover, that Tom in turn delegates the
administration right to Mary at time 110. Figure 3(a) illustrates the
corresponding administration graph.

Consider  now  the  following  cases: 

1. Bob revokes the administration right from Tom. As a consequence,
Mary loses the right received from Tom. However, she keeps the right
received directly from Bob. Figure 3(b) shows the resulting graph.

2. Bob transfers at time 210 the ownership to John. Suppose that the
transfer policy has been specified with the grantor transfer option.
As illustrated by the graph in Figure 3(c), both Tom and Mary keep
their administration rights. The graph includes a third type of node,
called shadow node, which is used to keep track of previous object
owners.

3. Bob transfers at time 210 the ownership to John. Suppose that the
transfer policy has been specified with the recursive revoke option.
As illustrated by the graph in Figure 3(d), both Tom and Mary lose
their administration rights. As in the previous case, information
about the former owner is kept in the graph.
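The three cases of Example 3 can be replayed with a small Python sketch. It is only an illustration under stated assumptions, not the MPAS implementation: the class name AdminGraph, its method names and the option strings are hypothetical, and revocation is modelled as a plain reachability cascade (an arc survives only while its grantor is still the owner or still reachable from the owner). The timestamp-based revoke semantics formalized in [4] is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class AdminGraph:
    """Administration graph of one object; arcs are (grantor, grantee, time)."""
    owner: str
    arcs: set = field(default_factory=set)
    shadows: list = field(default_factory=list)  # former owners (shadow nodes)

    def delegate(self, grantor, grantee, time):
        self.arcs.add((grantor, grantee, time))

    def administrators(self):
        """The owner plus every user reachable from the owner along arcs."""
        reached, frontier = {self.owner}, [self.owner]
        while frontier:
            user = frontier.pop()
            for grantor, grantee, _ in self.arcs:
                if grantor == user and grantee not in reached:
                    reached.add(grantee)
                    frontier.append(grantee)
        return reached

    def _prune(self):
        """Cascade: drop arcs whose grantor is no longer an administrator."""
        admins = self.administrators()
        self.arcs = {arc for arc in self.arcs if arc[0] in admins}

    def revoke(self, grantor, grantee):
        """Remove the grantor's own arc to the grantee, then cascade."""
        self.arcs = {a for a in self.arcs if (a[0], a[1]) != (grantor, grantee)}
        self._prune()

    def transfer(self, new_owner, option):
        old_owner = self.owner
        self.shadows.append(old_owner)
        self.owner = new_owner
        if option == "grantor-transfer":
            # Keep the grants; the new owner replaces the old one as grantor.
            self.arcs = {(new_owner if g == old_owner else g, e, t)
                         for g, e, t in self.arcs}
        elif option == "recursive-revoke":
            # Drop the old owner's grants; _prune() removes what depended on them.
            self.arcs = {a for a in self.arcs if a[0] != old_owner}
        else:
            raise ValueError(f"unknown transfer option: {option}")
        self._prune()


def example_3():
    """The graph of Figure 3(a), with the delegation times of Example 3."""
    g = AdminGraph(owner="Bob")
    g.delegate("Bob", "Tom", 100)
    g.delegate("Tom", "Mary", 110)
    g.delegate("Bob", "Mary", 120)
    return g
```

Calling example_3().revoke("Bob", "Tom") leaves Bob and Mary as administrators (case 1); transfer("John", "grantor-transfer") keeps Tom and Mary with John as their grantor (case 2); transfer("John", "recursive-revoke") leaves John as the only administrator (case 3).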

The following example illustrates joint administration.

Example 4 Consider a table T, administered under the joint
administration policy. Suppose that Bob and Ken are the owners of the
table. Suppose that the following grant operations are performed:

1. Bob grants Laura the Read access on T at time 105.

2. Ken grants Laura the Read access on T at time 110.

Figure 4(a) shows the resulting authorization graph.

Consider the following access requests issued by Laura:

1. Read access to T at time 105; such access is denied because, at
time 105, Laura does not have the authorization from Ken.

2. Read access to T at time 115; such access is allowed because, at
time 115, Laura possesses authorizations from both Bob and Ken.

® We use here a simplified representation of time as an integer
number. In real implementations, the system timestamp is used.

The following example illustrates the difference between joint
administration and delegation.

Example 5 Consider again table T. Suppose that it is administered
under a non-joint policy and that Bob is its owner. Moreover, suppose
that Bob delegates Ken the administration right at time 80. Suppose
now that Bob and Ken perform the same grant operations illustrated in
Example 4. Figure 4(b) shows the corresponding administration and
authorization graphs. Consider again the access requests, listed in
Example 4, issued by Laura:

1. Read access to T at time 105; such access is allowed, because
Laura possesses the authorization granted by Bob.

2. Read access to T at time 115; such access is allowed because Laura
possesses two authorizations, one from Bob and another from Ken.
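The contrast between Examples 4 and 5 can be checked with a few lines of Python. This is a minimal sketch, assuming that a grant issued at time t is effective for requests at any time greater than or equal to t (which is how the examples read) and that the set of administrators is simply passed in; the function name may_access and the tuple layout are hypothetical, and delegation times are not checked.

```python
def may_access(grants, administrators, user, mode, time, joint):
    """Decide an access request from a list of (grantor, grantee, mode, time) grants.

    joint=True  : every administrator must have issued the grant (Example 4).
    joint=False : one grant from any administrator is enough (Example 5).
    """
    issued_by = {grantor for grantor, grantee, m, t in grants
                 if grantee == user and m == mode and t <= time}
    valid = issued_by & set(administrators)
    if joint:
        return valid == set(administrators)
    return bool(valid)


# Grant operations shared by Examples 4 and 5.
GRANTS = [("Bob", "Laura", "Read", 105), ("Ken", "Laura", "Read", 110)]
ADMINS = {"Bob", "Ken"}

joint_105 = may_access(GRANTS, ADMINS, "Laura", "Read", 105, joint=True)
joint_115 = may_access(GRANTS, ADMINS, "Laura", "Read", 115, joint=True)
single_105 = may_access(GRANTS, ADMINS, "Laura", "Read", 105, joint=False)
single_115 = may_access(GRANTS, ADMINS, "Laura", "Read", 115, joint=False)
```

Under the joint policy the request at time 105 is denied and the one at time 115 is allowed; under the non-joint policy with delegation both requests are allowed.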

Delegation  and  transfer  also  differ  with  respect  to 
the  semantics  of  the  revoke  operations  [4]. 

Several interesting issues are related to the semantics of the
delegation option when combined with joint administration. Indeed,
one of the administrators of the object may delegate the
administration of the object to other users. This means that the
delegated user may issue a grant request instead of the original
object administrator. This grant request must be combined with the
grant requests from all the other administrators (or from users
delegated by them) before the authorization can actually be issued.
If a quorum option is used, multiple grant requests from an
administrator and his delegates amount to a single request with
respect to the quorum computation. This approach prevents an
administrator from reaching the quorum simply by delegating the
administration rights to other users. The following example
illustrates joint administration with delegation.

Figure 4: (a) The authorization graph for Example 4; (b) The
administration and authorization graphs for Example 5

Example 6 Consider a table T, administered under the joint ownership
administration policy with quorum option. Suppose that Bob, Ken and
George are the owners of the table. Suppose that the quorum is 2.
Suppose that the following delegation and grant operations are
issued:

1. Bob delegates John the administration authorization on T at time
100;

2. Bob grants Laura the Read access on T at time 120;

3. John grants Laura the Read access on T at time 130;

4. Ken grants Laura the Read access on T at time 150.

Figure 5 shows the resulting administration and authorization graphs.

Consider the following access requests issued by Laura:

1. Read access to T at time 130; such access is not allowed, because
two read authorizations have been issued for Laura, but they are from
an administrator and one of his delegates;

2. Read access to T at time 160; such access is allowed because Laura
now possesses two authorizations, granted by two different
administrators (or their delegates).

Finally, note that even if Bob had not granted the Read authorization
to Laura at time 120, she would still be able to access T at time 160
(provided that authorizations (3) and (4) above had been granted).

We  refer  the  reader  to  [4]  for  a  discussion  and  for  a 
formal  definition  of  the  semantics  of  delegation  when 
combined  with  joint  administration. 

4  Formal  Model 

In  this  section  we  formalize  some  aspects  of  our 
authorization  administration  model. 

Let O be the set of objects and U the set of users in the system. Let
VBA denote the set of all users having DBA authorizations. Let N
denote the set of natural numbers. Moreover, let VS = {DBA,
object-curator, object-owner, joint-object-curator,
joint-object-owner} denote the set of administration policy types. An
authorization administration policy is defined as follows.

Definition 1 (Administration policy) An administration policy is a
7-tuple [o, pt, delegation_opt, transfer_opt, acceptance_opt,
revoke_opt, vote_opt], where $o \in O$, $pt \in VS$,
delegation_opt $\in$ {delegation, no-delegation, nil},
transfer_opt $\in$ {transfer, no-transfer, nil},
acceptance_opt $\in$ {acceptance, no-acceptance, nil},
revoke_opt $\in$ {revoke, grantor-transfer, nil},
vote_opt $\in$ {quorum, totality, nil}.

Figure 5: Administration and authorization graphs for Example 6

In the above definition, a number of components of the 7-tuple
defining an administration policy are actually flags indicating
specific options. Also, those flags may take a nil value. The nil
value simply denotes that the flag is not significant for the
specific policy type. Finally, the vote_opt component is only
significant for joint administration policies (for non-joint
administration policies it always takes the nil value). It takes the
totality value to denote that for an authorization to be granted all
the administrators must have issued the proper grant requests; it
takes the value quorum otherwise.

Example 7 The policy specifications below illustrate the above
definition:

• The policy specification:

[Public-info, DBA, nil, nil, nil, nil, nil]

states that table Public-info must be administered by the DBA. All
other components are not significant for this policy and are thus set
to nil.

• The policy specification:

[T, object-owner, delegation, transfer, no-acceptance,
grantor-transfer, nil]

states that table T must be administered by the owner. Moreover, both
delegation and transfer are allowed on this table. Transfer is
without acceptance and the granted administration authorizations are
not revoked if the ownership is transferred. This policy is the one
exemplified in case 2 of Example 3. Note that vote_opt is nil since
the policy type is not a joint one.

• The policy specification:

[T, joint-object-owner, delegation, transfer, no-acceptance,
grantor-transfer, totality]

states that table T is under a joint administration policy. In this
case the vote_opt component indicates that all the administrators
must issue grant requests for an authorization to be actually granted
to a user.
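One possible encoding of the 7-tuple of Definition 1 is sketched below in Python. The class name AdministrationPolicy, the use of None for nil, and the decision to reject a non-nil vote_opt on non-joint policies are assumptions made for illustration; only the component names and their value domains come from the definition.

```python
from dataclasses import dataclass
from typing import Optional

POLICY_TYPES = {"DBA", "object-curator", "object-owner",
                "joint-object-curator", "joint-object-owner"}

# Value domain of each option flag; None plays the role of nil.
OPTION_DOMAINS = {
    "delegation_opt": {"delegation", "no-delegation", None},
    "transfer_opt": {"transfer", "no-transfer", None},
    "acceptance_opt": {"acceptance", "no-acceptance", None},
    "revoke_opt": {"revoke", "grantor-transfer", None},
    "vote_opt": {"quorum", "totality", None},
}


@dataclass(frozen=True)
class AdministrationPolicy:
    o: str   # object name
    pt: str  # policy type
    delegation_opt: Optional[str] = None
    transfer_opt: Optional[str] = None
    acceptance_opt: Optional[str] = None
    revoke_opt: Optional[str] = None
    vote_opt: Optional[str] = None

    def __post_init__(self):
        if self.pt not in POLICY_TYPES:
            raise ValueError(f"unknown policy type: {self.pt}")
        for name, domain in OPTION_DOMAINS.items():
            if getattr(self, name) not in domain:
                raise ValueError(f"invalid value for {name}")
        # vote_opt is only significant for joint policies.
        if not self.pt.startswith("joint-") and self.vote_opt is not None:
            raise ValueError("vote_opt must be nil for non-joint policies")


# The three specifications of Example 7.
p1 = AdministrationPolicy("Public-info", "DBA")
p2 = AdministrationPolicy("T", "object-owner", "delegation", "transfer",
                          "no-acceptance", "grantor-transfer", None)
p3 = AdministrationPolicy("T", "joint-object-owner", "delegation", "transfer",
                          "no-acceptance", "grantor-transfer", "totality")
```

Making the dataclass frozen keeps a policy immutable once validated, so a set of such objects can stand in for the policy administration base.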

The policy administration base is a set of administration policies,
denoted as VAB.

Information specified by the administration policies is complemented
with information about:

(i) which users are DBAs;

(ii) which administrators have delegated (transferred) administration
authorizations to which other users;

(iii) for each data object, its owner (owners), if the object is
administered under the (joint) owner administration policy, or its
“curator” (“curators”) if the object is administered under the
(joint) “curator” administration policy;

(iv) for each object administered under a joint administration policy
with quorum, the actual quorum value to use.

The above information is stored as facts in the policy base (see
Figure 1).

Figure 6: Administration graph with non-independent and independent
administrators

Definition 2 (Delegation and transfer specification) A delegation
(transfer) specification is a triple [grantor, grantee, o], with
grantor, grantee $\in U$, and $o \in O$.

The delegation (transfer) base is a set of delegation (transfer)
specifications, denoted as VC (Til).

Definition 3 (Owner and curator specification) An owner (curator)
specification is a pair [o, u], with $o \in O$ and $u \in U$; u is
referred to as an initial administrator of o.

Note that when the joint administration policy is used for an object,
there are several initial administrators for the object.

The owner (curator) base is a set of owner (curator) specifications,
denoted as OWB (CB).

Definition 4 (Quorum specification) A quorum specification is a pair
[o, n], with $o \in O$, $n \in N$.

The quorum base is a set of quorum specifications, denoted as QB.

The following definition introduces the notion of delegates and
states how the delegates of a given administrator are determined from
the delegation base. Note that an administrator can have indirect
delegates; indeed, an administrator may delegate a user the
administration right and this user may, in turn, delegate other
users.

Definition 5 (Delegates) Let $u_i$ and $u_j$ be users. Let o be an
object. We say that $u_j$ is a delegate of $u_i$ for the
administration of o if predicate $d(u_i, u_j, o)$, defined below, is
True.

• $d(u_i, u_j, o) = True$, if $[u_i, u_j, o] \in VC$;

• $d(u_i, u_j, o) = True$, if there exists $u_k \in U$ such that
$d(u_i, u_k, o) = True$ and $d(u_k, u_j, o) = True$.

Let o be an object and let u be a user such that u is an initial
administrator of o. The set
$A_u = \{u\} \cup \{u_i \mid u_i \in U \text{ and } d(u, u_i, o) = True\}$
denotes a set including u and all the delegates of u.
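Definition 5 describes the transitive closure of the delegation base, which can be sketched in Python as follows. The function names delegates_of, d and admin_set are hypothetical, the delegation base is assumed to be an iterable of (grantor, grantee, object) triples, and the sample base reuses the delegations of Example 3.

```python
def delegates_of(u, o, delegation_base):
    """All users u_j such that d(u, u_j, o) = True (direct or indirect)."""
    found, frontier = set(), [u]
    while frontier:
        current = frontier.pop()
        for grantor, grantee, obj in delegation_base:
            if obj == o and grantor == current and grantee not in found:
                found.add(grantee)
                frontier.append(grantee)
    return found


def d(u_i, u_j, o, delegation_base):
    """Predicate of Definition 5."""
    return u_j in delegates_of(u_i, o, delegation_base)


def admin_set(u, o, delegation_base):
    """The set A_u: u together with all of its delegates for object o."""
    return {u} | delegates_of(u, o, delegation_base)


# Delegations of Example 3 on table T.
BASE = [("Bob", "Tom", "T"), ("Tom", "Mary", "T"), ("Bob", "Mary", "T")]
A_BOB = admin_set("Bob", "T", BASE)
```

With this base, Mary is a delegate of Bob both directly and through Tom, and A_BOB contains Bob, Tom and Mary.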

The following definition states the notion of independent
administrators. It is the basis of the rule determining when a grant
becomes valid in the case of joint administration.

Definition 6 (Independent administrators) Let $u_i$ and $u_j$ be
users. Let o be an object. We say that $u_i$ and $u_j$ are
independent administrators of o, if the following conditions are
verified:

• $\exists u_i'$ initial administrator of o such that
$u_i \in A_{u_i'}$;

• $\exists u_j'$ initial administrator of o such that
$u_j \in A_{u_j'}$;

• $\nexists u^*$ initial administrator of o such that
$u_i \in A_{u^*}$ and $u_j \in A_{u^*}$.

$u_i$ and $u_j$ are called non-independent administrators, otherwise.

The above definition states that two users who have been delegated by
other users to administer a given object are independent if they have
received their administration privilege from users that are, in turn,
independent. Note that the initial administrators of the object are
always independent.

Example 8 Consider the administration graphs in Figure 6.
$A_{Bob}$ = {Bob, Tom, John} and $A_{Ken}$ = {Ken, Mary}. The pairs of
independent administrators include: (Bob, Ken), (Bob, Mary), and
(Tom, Ken). The pairs of non-independent administrators include:
(Tom, John), (Bob, John).
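The pairs listed in Example 8 can be reproduced with a short Python check of Definition 6. This is a sketch only: the function name independent is hypothetical, and because the arcs of Figure 6 are not available here, the two administrator sets are taken directly from the example rather than derived from a delegation base.

```python
def independent(u_i, u_j, admin_sets):
    """Definition 6, given a mapping: initial administrator -> its set A."""
    covers_i = {a for a, members in admin_sets.items() if u_i in members}
    covers_j = {a for a, members in admin_sets.items() if u_j in members}
    # Each user must belong to some A set, and no single A set holds both.
    return bool(covers_i) and bool(covers_j) and not (covers_i & covers_j)


# Administrator sets stated in Example 8.
ADMIN_SETS = {"Bob": {"Bob", "Tom", "John"}, "Ken": {"Ken", "Mary"}}

independent_pairs = [pair for pair in
                     [("Bob", "Ken"), ("Bob", "Mary"), ("Tom", "Ken")]
                     if independent(*pair, ADMIN_SETS)]
non_independent_pairs = [pair for pair in
                         [("Tom", "John"), ("Bob", "John")]
                         if not independent(*pair, ADMIN_SETS)]
```

Both lists come back complete, which matches the classification given in the example.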
Function grt(u, o)

grt : $U \times O \rightarrow \{True, False\}$

let pt be the policy type of object o (determined from VAB)
case (pt):
    DBA: if $u \in VBA$, then True, else False;
    object-curator: if $[o, u] \in CB$ or
        ($\exists u'$ such that $[o, u'] \in CB$ and $d(u', u, o) = True$) then True, else False;
    object-owner: if $[o, u] \in OWB$ or
        ($\exists u'$ such that $[o, u'] \in OWB$ and $d(u', u, o) = True$) then True, else False;
    joint-object-curator: if $[o, u] \in CB$ or
        ($\exists u'$ such that $[o, u'] \in CB$ and $d(u', u, o) = True$) then True, else False;
    joint-object-owner: if $[o, u] \in OWB$ or
        ($\exists u'$ such that $[o, u'] \in OWB$ and $d(u', u, o) = True$) then True, else False.


Figure 7: High level specification of function grt


The following rule formally states when a set of authorizations
granted by administrators of an object to a user actually enables the
authorization for this user.

Joint administration rule

Let $GU = \{u_1, \ldots, u_n\}$ be the set of users who have granted
an authorization for the same privilege on object o to user u. Let v
be the vote policy for object o. Let $GU^*$ be the maximal subset of
GU such that $\forall u_i, u_j \in GU^*$, $u_i$ and $u_j$ are
independent administrators of o.^ The granted authorization is
enabled if the following conditions are verified:

• If v = totality: let adm be the number of initial administrators of
object o. Then $card(GU^*)$^ must be greater than or equal to adm.

• If v = quorum: let q be the quorum required for object o. Then
$card(GU^*)$ must be greater than or equal to q.

Note that the above definition implies that when two users $u_1$ and
$u_2$ give the same authorization to the same user, these
authorizations are both effective to enable the authorization for the
user only if $u_1$ and $u_2$ have obtained the administer privilege
from two independent sources, that is, there does not exist a user
which gave the administer privilege to both $u_1$ and $u_2$.
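The joint administration rule can be exercised on Example 6 with the Python sketch below. Several things are assumptions: the function names are hypothetical, the administrator sets are passed in as a mapping from each initial administrator to its set A, and GU* is found by brute force as a largest subset of pairwise independent grantors, which is only practical for small sets of grantors.

```python
from itertools import combinations


def independent(u_i, u_j, admin_sets):
    """Definition 6: both users are covered, and no single A set holds both."""
    covers_i = {a for a, members in admin_sets.items() if u_i in members}
    covers_j = {a for a, members in admin_sets.items() if u_j in members}
    return bool(covers_i) and bool(covers_j) and not (covers_i & covers_j)


def largest_independent_subset(gu, admin_sets):
    """A largest subset of GU whose members are pairwise independent."""
    # Grantors that are not administrators at all can never count.
    candidates = sorted(u for u in gu
                        if any(u in members for members in admin_sets.values()))
    for size in range(len(candidates), 0, -1):
        for subset in combinations(candidates, size):
            if all(independent(a, b, admin_sets)
                   for a, b in combinations(subset, 2)):
                return set(subset)
    return set()


def enabled(gu, admin_sets, vote, quorum=None):
    """Joint administration rule for vote policy 'totality' or 'quorum'."""
    size = len(largest_independent_subset(gu, admin_sets))
    if vote == "totality":
        return size >= len(admin_sets)  # number of initial administrators
    if vote == "quorum":
        return size >= quorum
    raise ValueError(f"unknown vote policy: {vote}")


# Example 6: owners Bob, Ken and George; Bob has delegated John; quorum 2.
ADMIN_SETS = {"Bob": {"Bob", "John"}, "Ken": {"Ken"}, "George": {"George"}}

at_130 = enabled({"Bob", "John"}, ADMIN_SETS, "quorum", quorum=2)
at_160 = enabled({"Bob", "John", "Ken"}, ADMIN_SETS, "quorum", quorum=2)
without_bob = enabled({"John", "Ken"}, ADMIN_SETS, "quorum", quorum=2)
```

The grants of Bob and John count once, so the request at time 130 is not enabled, while the one at time 160 is, with or without Bob's own grant.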

An important function that must be defined as part of the model is
that of checking whether a user, wishing to perform a grant
operation, is authorized to do so. Authorization to perform a grant
operation on an object depends on the administration policy
established for the object. Figure 7 reports a high level
specification of function grt; such function receives as arguments
the user wishing to perform the grant and the object on which the
grant is to be issued. It returns True if the user is authorized to
issue the grant, it returns False otherwise.

^This implies that there does not exist $GU' \subseteq GU$ such that
$\forall u_i, u_j \in GU'$, $u_i$ and $u_j$ are independent
administrators of o, and $GU^* \subset GU'$.

^$card(GU^*)$ denotes the cardinality of set $GU^*$.
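The specification of grt in Figure 7 translates almost line by line into Python. The sketch below is illustrative rather than the MPAS code: the bases are assumed to be plain Python sets of tuples, the parameter names are hypothetical, the user Alice is invented sample data, and the delegate predicate d is re-implemented inline so that the block is self-contained.

```python
def d(u_i, u_j, o, delegation_base):
    """True if u_j is a direct or indirect delegate of u_i for object o."""
    reached, frontier = set(), [u_i]
    while frontier:
        current = frontier.pop()
        for grantor, grantee, obj in delegation_base:
            if obj == o and grantor == current and grantee not in reached:
                reached.add(grantee)
                frontier.append(grantee)
    return u_j in reached


def grt(u, o, policy_types, dba_users, curator_base, owner_base,
        delegation_base):
    """Is user u authorized to issue a grant on object o?"""
    pt = policy_types[o]
    if pt == "DBA":
        return u in dba_users
    if pt in ("object-curator", "joint-object-curator"):
        base = curator_base
    elif pt in ("object-owner", "joint-object-owner"):
        base = owner_base
    else:
        raise ValueError(f"unknown policy type: {pt}")
    # Initial administrators of o, then anyone they delegated.
    initial = {user for obj, user in base if obj == o}
    return u in initial or any(d(a, u, o, delegation_base) for a in initial)


# Hypothetical sample state, loosely following Examples 3 and 7.
STATE = dict(
    policy_types={"T": "object-owner", "Public-info": "DBA"},
    dba_users={"Alice"},
    curator_base=set(),
    owner_base={("T", "Bob")},
    delegation_base={("Bob", "Tom", "T"), ("Tom", "Mary", "T")},
)
```

For instance, grt("Mary", "T", **STATE) holds because Mary is an indirect delegate of the owner Bob, whereas grt("Laura", "T", **STATE) does not.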

5  Conclusions 

In this paper we have presented an overview of the administration
policies supported by the MPAS multipolicy authorization system. The
system supports a variety of policies that are obtained by providing
several options, such as delegation and administration. Joint
administration policies are also supported. Many of these policies
are useful in advanced applications, such as workflow systems and
computer-supported cooperative work and, in general, cooperative
applications.

A notable feature of MPAS is the notion of parametric authorization
policies. Such authorization policies are very useful when dealing
with data objects created through an application program and which
need to be accessed from the same application program (such as
temporary tables), or when dealing with a large number of users and
data objects, as in most real applications.

The system is currently under implementation. A preliminary
implementation of a flexible authorization mechanism, a core
component of MPAS, has been completed. We are extending both our
model and architecture to provide a more accurate modeling of roles,
along the lines discussed by Sandhu in [14] and [13], and to
incorporate temporal authorization features [1]. A final research
direction we plan to pursue is related to the use of multipolicy
systems in heterogeneous systems [10].

Abstract

This paper addresses two main issues of the DOK system, namely the
design of a framework for enforcing security policies and a secure
architecture which implements such a framework. Federated security
policies are expressed as logic-based expressions (called
“aggregation constraints”) specifying the different combinations of
transactions that a user is not allowed to issue, either in single or
multiple states of a federation.

• To enable efficient monitoring of aggregation constraints, state
transition graphs are generated to model the different
sub-computations of the constraints. Two marking techniques, namely
LMT (Linear Marking Technique) and ZMT (Zigzag Marking Technique),
are proposed to detect violations of federated security policies.

• To enable an effective enforcement of security policies, we
designed a secure DOK architecture using specialised agents: (i)
coordination agents allow the coordination of different federated
activities, (ii) task agents perform specific tasks of the
federation, and finally (iii) database agents provide the required
information for coordination and task agents.

1  Motivation 

Relatively few databases are accessible over the Internet. With
today’s technology one would like to encapsulate a database and make
it available over the Internet. A client using such databases would
browse an old census database, look up references in an
object-oriented database system, access descriptions and pictures
over the Internet, or combine different information using NCSA
Mosaic, WWW, or back-end databases.


Security  issues  mainly  prevent  unlimited  access  to 
a  wide  number  of  heterogeneous  databases.  We  be¬ 
lieve  that  this  issue  needs  only  to  be  addressed  within 
an  environment  in  which  databases  can  be  connected 
and  used  without  violation  of  their  security  policies. 
This  (federated)  environment  ensures  that  all  its  com¬ 
ponent  databases  can  be  used  separately  or  in  combi¬ 
nation  without  compromising  global  and  local  security 
policies. 

The Distributed Object Kernel (DOK) project [15] at Royal Melbourne
Institute of Technology is concerned with the research and
development of a secure database middleware to effectively search,
update and combine information within a distributed and federated
environment. DOK uses CORBA (Common Object Request Broker
Architecture) technology, the distributed-object standard developed
by the OMG (Object Management Group), to communicate across different
database platforms. In addition, DOK provides federated services
allowing clients to use multiple databases in combination; these
include a query service [11], a reengineering service [14], and a
reflection service [2].

This paper addresses the design of the DOK security service,
including the development of a secure architecture to effectively
enforce federated security policies in the context of autonomous,
distributed and heterogeneous databases. Users can access, update or
combine data from different databases involved in a federation
through a DOK layer. The security service is responsible for
detecting any violation of local and federated security policies, and
for triggering appropriate actions if such a violation is found prior
to the manipulation of data. To implement such a detection mechanism,
two main issues need to be addressed:

• The enforcement of local security policies: This involves the
design of a federated access control mechanism aiming for integration
of different access controls which could have been imposed on local
databases. Such an access control will specify and define the
appropriate rights that a user is granted to access and/or update
data in a distributed and heterogeneous environment.

• The enforcement of federated security policies: This concerns the
design of an inference mechanism which manages security policies. It
also prevents a user from obtaining a collection of data from
different databases that may enable the user to infer sensitive
information [7].

Regarding the first issue, we proposed in [13] a federated access
control (called GAC) allowing a “generic” expression of the security
requirements for databases involved in a federation. Each object
defined within the DOK federation, also called a virtual object, has
a set of associated access lists (ACLs) which specify the different
access rights of its attributes. The DOK approach describes such
security labels in terms of basic transactions (e.g.,
read/write/update) so that they become independent of any specific
access control (e.g., DAC or MAC).

This paper describes a solution to the second issue identified above.
This solution involves three aspects: (1) the design of a (simple)
language to express different types of aggregation constraints, (2) a
technique to monitor these constraints, and finally (3) a logical
architecture and procedures to enforce the security policies. Our
solution can be summarised as follows:

(1)  A  logic-based  language,  called  FELL  (acronym  of 
FEderated  Logic  Language),  is  designed  to  model 
different  types  of  violation  of  federated  security 
policies  when  federated  transactions  are  submit¬ 
ted.  This  language  can  describe  both  combina¬ 
tion  of  transactions  issued  on  a  single  state  of 
a  federation  (called  static  constraints)  as  well  as 
those  issued  on  multiple  states  (called  dynamic 
constraints). 

(2) When aggregation constraints are expressed on a given federation,
they are transformed into appropriate data structures, called state
transition graphs, so that their monitoring becomes efficient. The
nodes of such graphs model the sub-computations of the constraints.
The node labels are atomic formulas, which are used during the
monitoring process, and are based on the past and/or present state(s)
of a federation. Due to the limited size of the paper, the focus will
be on the graphs which have at least one “true” terminal node. These
graphs are called true graphs and an appropriate monitoring technique
is proposed for such graphs based on a “linear-way” of marking nodes.
The remaining type of graphs, called false graphs, requires a
different monitoring technique in which nodes are not marked in a
sequential order. A specific technique enabling the construction and
marking of false graphs is proposed in [12].

(3) The DOK architecture is a three-layer architecture involving a Coordination layer, a Task layer and a Database layer. Each of these layers contains specialised agents that enforce a certain part of the federated security policies. Coordination tasks (e.g., finding an appropriate agent to process a certain request) are performed by agents such as the DOK Manager. The enforcement of the security tasks (e.g., constraint maintenance) is performed by specialised agents such as the Constraint Manager. Finally, the database functions (e.g., retrieval of information about a specific user) are implemented by the user and data agents.

This  paper  is  organised  as  follows.  The  next  sec¬ 
tion  overviews  the  DOK  environment.  In  section  3  we 
briefly  describe  the  FELL  language.  A  framework  for 
monitoring  constraints  is  proposed  in  section  4.  The 
description  of  the  appropriate  security  agents  for  the 
enforcement  of  federated  security  policies  is  given  in 
section  5.  Finally,  in  section  6  we  conclude  with  the 
current  and  future  work. 


2  Expressing  Aggregation  Constraints 

This  section  describes  the  syntax  of  the  language  to 
be  used  for  the  expression  of  federated  security  policies 
in  a  DOK  environment.  Prior  to  that,  we  will  first 
introduce  the  different  elements  of  a  DOK  application, 
i.e.  the  reference  model. 

2.1  The  Reference  Model 

DOK [15] is a set of managers which oversee the smooth running of a federated system and are responsible for ensuring the operational requirements of a federation. Users interact with a federation through the local external schema of one of the component databases, implemented in the local wrapper. Users’ requests involving remote data are analysed by the local wrapper and re-directed to the DOK Manager, which has to ensure proper transaction management, concurrency control and query management.

A DOK schema is defined by a set of virtual objects and relationships (such as aggregation and inheritance relationships). Unlike conventional objects, which define physically stored entities, virtual objects describe conceptual entities defined as aggregations of objects of a distributed system. Figure 1 illustrates an example of virtual objects whose attributes are defined by “picking up” information from three databases, namely a personal database (pDB, which stores information about staff members of the different departments of a given university), a student database (stDB, which stores information about students and their results), and a bitmap database (bitDB, which stores pictures of both staff and students of the different departments). The virtual object Department is built by references to information located in the databases pDB and bitDB. In a similar way, the virtual object Student contains three types of information: Looks-like (which refers to a picture in bitDB), Personal-information (which refers to a relation or view of stDB) and Results (which is a SQL query on stDB constructing the results of a student). The reader may refer to [15] for more details about the DOK reference model.

Figure 1: An Example of Virtual Objects


Using the GAC model proposed in [13], local security requirements are transformed into the federated level to allow the understanding of the local security requirements by the DOK managers. GAC is a generic access control which integrates most of the existing access controls, such as MAC and DAC.


2.2  The  FELL  Language 

This section describes the syntax of FELL, the acronym for Federated Logic Language, which is designed to be used by a security administrator to express the different security requirements of a federation. This language provides logic constructs to express aggregation constraints over a database federation; however, it does not provide any interpretation (or monitoring) of these constraints. As stated in the first section, the processing of the aggregation constraints is performed by generating the appropriate data structures, i.e. state transition graphs, and marking them later to detect any violation of security policies. Due to the size limitations of the paper, the semantics of the FELL expressions will not be discussed.

FELL’s expressions enable the modelling of situations where a set of transactions issued by a user is not permitted to be processed in combination on the different states of a federation, even though the user has all the appropriate access rights in the individual databases of the federation. These situations generally lead to the violation of federated security policies because the user can infer more sensitive information for which he/she does not have the required access rights [7]. The different situations which lead to the violation of federated policies are expressed as FELL expressions, called aggregation constraints. In the DOK environment, we distinguish between two types of aggregation constraints: static constraints and dynamic constraints. The former are modelled as predicates on a single state of a federation, meaning that when these predicates are evaluated as true for a given state, appropriate actions are triggered. Dynamic constraints, on the other hand, are predicates specified over a sequence of states.

FELL’s expressions related to static constraints are first-order predicate logic formulas. Dynamic constraints are temporal logic expressions which incorporate the temporal operators always, sometimes and next. In what follows, we propose some examples of static and dynamic constraints on the virtual objects of Figure 1.

EXAMPLE 1 “the user u1 cannot read the attributes salary and name of a staff employee at the same time”

If we take into account different scenarios about the states of a federation, the above constraint may have two possible interpretations. The first interpretation relates to a situation where the constraint is checked within a single state (i.e., the system checks whether the user u1 has issued a transaction in which the attributes salary and name are read at the same time). These constraints are static constraints. Alternatively, the user u1 may have issued two transactions, each of which reads only one of the attributes. This means that the constraint needs to be checked over different states of a federation. This last interpretation refers to what we call dynamic constraints.

2.2.1  Static  Constraints 

These are constraints which apply over only a single state of a federation. They express aggregations which forbid users to combine data within a single transaction. For example, the constraint stated in Example 1 is a static constraint and can be expressed as two FELL formulas:

(1) ∀ c: Staff, if read(u1, c.name)
implies not(read(u1, c.salary))

(2) ∀ c: Staff, if read(u1, c.salary)
implies not(read(u1, c.name))

These formulas basically express the order in which the user issues the read operations. In (1), the user has first read the name of a staff member, whereas in (2) the user has started by reading the salary of the employee. In the general case, FELL provides logic constructors, such as term, atomic formula and well formed formula, to construct logical expressions of aggregation constraints. A term designates either a constant, a predicate over a virtual object, a variable, or a projected variable. FELL also supports the definition of atomic formulas as a composition of terms using specific rules. For instance, if $V$, $\Omega$ and $\Pi$ are terms, then applications of read, write and delete to these terms, such as read($V$, $\Omega$) and write($V$, $\Omega$, $\Pi$), are atomic formulas. Using the concept of atomic formula, one can construct more complex structures of the universe of discourse. These FELL structures are called well formed formulas (in short, formulas) and they can be built using appropriate rules.
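As an illustration of the single-state interpretation of Example 1, the following Python sketch checks one transaction for the forbidden combination of attributes. It is only a sketch under stated assumptions: the `Read` record, the object identifiers such as `"staff:1"` and the function `violates_static` are hypothetical names introduced here, and they are not part of FELL or DOK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical record of one read sub-transaction."""
    user: str
    obj: str          # identifier of a virtual object instance, e.g. "staff:1"
    attribute: str


def violates_static(transaction, user, forbidden):
    """Return True if `user` reads every attribute in `forbidden`
    on the same object within this single transaction."""
    seen = {}
    for op in transaction:
        if op.user == user and op.attribute in forbidden:
            seen.setdefault(op.obj, set()).add(op.attribute)
    # A violation needs all forbidden attributes of one object to be combined.
    return any(attrs == set(forbidden) for attrs in seen.values())
```

With this reading, a transaction that reads both `name` and `salary` of the same staff object is flagged, whereas reading the name of one employee and the salary of another is not.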


2.2.2  Dynamic  Constraints 

These  are  defined  over  several  states  of  a  federation 
(e.g.,  past,  present  and  future).  They  express  long¬ 
term  data  dependencies  between  remote  states  in  the 
evolution  of  a  federation.  Thus,  the  different  states  in 
a  sequence  must  be  inspected  as  a  whole  in  order  to 
detect  a  violation  of  a  constraint. 

Let  us  consider  an  example  of  a  dynamic  constraint. 
Using  the  example  of  Figure  1,  one  can  stipulate  that 
the  only  user  who  can  update  the  salary  of  an  em¬ 
ployee  is  the  boss  of  the  department.  This  constraint 
can  be  expressed  as  follows: 


∀ e, d, u, e:Staff, d:Department, u:User,
always (e ∈ d.staff and d.boss = u)
before write(u, e.salary, .)

Another alternative is to allow the update of the salary of an employee only by people who have been the head of a department at least once. This constraint relaxes the previous formula, in that only one state of a federation is required to satisfy the assumption that a user has been the boss of a department. This constraint is expressed as follows:

∀ e, d, u, e:Staff, d:Department, u:User,
sometimes (e ∈ d.staff and d.boss = u)
before write(u, e.salary, .)
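The difference between the two formulas can be made concrete with a small Python sketch. Since the semantics of FELL are not given in this paper, the sketch assumes one plausible reading: a federation history is a list of states, oldest first, and each state records whether the condition (e ∈ d.staff and d.boss = u) holds and whether a write is attempted. The function names and the history encoding are assumptions made for illustration only.

```python
def allowed_always_before(history):
    """Assumed reading of [always cond before write]: every write must occur
    in a state where cond has held in all states up to and including it."""
    held_in_all_states = True
    for cond_holds, write_attempted in history:
        held_in_all_states = held_in_all_states and cond_holds
        if write_attempted and not held_in_all_states:
            return False
    return True


def allowed_sometimes_before(history):
    """Assumed reading of [sometimes cond before write]: every write must be
    preceded (or accompanied) by at least one state where cond held."""
    held_at_least_once = False
    for cond_holds, write_attempted in history:
        held_at_least_once = held_at_least_once or cond_holds
        if write_attempted and not held_at_least_once:
            return False
    return True
```

Under this reading, a user who was the boss in an early state but no longer is when the write is issued passes the relaxed "sometimes" constraint and fails the "always" constraint.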


3  State  Transition  Graphs 

So  far  we  have  presented  some  of  the  concepts  re¬ 
lated  to  the  DOK  model  and  the  FELL  language.  This 
section  deals  with  the  analysis  of  aggregation  con¬ 
straints. 

Analysis of constraints consists of both detecting inconsistencies in the constraints and transforming the constraints into a form that can be monitored. The task of the monitor is to check whether federation states are admissible in the context of a certain federation history. This task is performed using state transition graphs to identify any violation of security policies. These graphs consist of:

• Nodes labelled with temporal formulas. The initial node is labelled with the constraint itself. The nodes reflect the conditions which have to be fulfilled by the future states of database objects.

• Edges labelled with non-temporal formulas. The edges reflect the transition between constraints, and their labels indicate under which conditions such a transition may occur.

Before  going  into  a  detailed  description  of  the  con¬ 
sistency  checking  mechanism,  we  first  give  a  few  ex¬ 
amples  of  state  transition  graphs  associated  with  ag¬ 
gregation  constraints.  Four  types  of  constraints  are 
described. 

CASE 1: The constraint is in the form of [sometimes $\psi$]. The following is an example of such a constraint

∀ s, ∃ u, s:Student, u:User,
sometimes(write(u, s.Results, .))

which stipulates that there exists a user who can update the results of all students. The type User is a predefined type and is used for defining FELL expressions. The corresponding state transition graph is shown in Figure 2.

Figure 2: “sometimes” predicate

Each student, say e, belongs to one of the two nodes of the graph illustrated in Figure 2. We assume that the constraint is in the form of [sometimes $\psi$].

• If in the current state of a federation, the object e does not verify the formula $\psi$, then e will be used again in the future states to check $\psi$. In this situation, the validation process of the constraint is still in the node (2) and the constraint can be re-expressed as [existnext $\psi$], meaning that $\psi$ is required to be checked in all future states.

• If e verifies the formula $\psi$, then the corresponding temporal constraint is valid. The node (3) of the state transition graph is reached.

CASE 2: This case deals with temporal formulas in the form of [always $\psi$]. The state transition graph of this type of constraint is different from the previous one because the temporal formula $\psi$ needs to be valid in every future state of a federation. Figure 3 illustrates the graph of the following constraint.

∀ s, ∃ u, s:Student, u:User,
always(write(u, s.Results, .))

In a given state of a federation

• if any object s of type Student can be updated by a given user, say u, then the constraint of type [always $\psi$] becomes true in the current state of a federation. Additionally, the constraint $\psi$ is required to be valid in all the future states. This constraint is re-expressed as [allnext $\psi$].

• if the formula $\psi$ is not fulfilled in the current state, i.e. there exists an object s which cannot be updated by any user, then the constraint always $\psi$ is no longer valid.

Figure 3: “always” predicate
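The node transitions described for CASE 1 and CASE 2 can be sketched as two small step functions in Python. This is a simplified illustration for a single object: the node names `"pending"`, `"true"` and `"false"` and the helper `run` are assumptions made here and do not correspond to the node numbering of Figures 2 and 3.

```python
def step_sometimes(node, psi_holds):
    """[sometimes psi]: wait until psi holds once, then stay in the True node."""
    if node == "true":
        return "true"                     # True is a terminal node
    return "true" if psi_holds else "pending"


def step_always(node, psi_holds):
    """[always psi]: remain pending while psi holds; one failure is final."""
    if node == "false":
        return "false"                    # the constraint is no longer valid
    return "pending" if psi_holds else "false"


def run(step, observations):
    """Feed the truth value of psi in each successive federation state."""
    node = "pending"
    for psi_holds in observations:
        node = step(node, psi_holds)
    return node
```

The sketch also shows why the two cases are monitored differently: a "sometimes" graph can end in a True terminal node, whereas an "always" graph can only stay pending or fail.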


CASE 3: This case is concerned with the order of validation of constraints. One situation can be in the form of [always $\psi$ before $\varphi$], where $\psi$ and $\varphi$ are non-temporal constraints. Let us consider the following constraint:

∀ s, ∀ u, s:Student, u:User,
always(read(u, s.name)) before
write(u, s.Results, .)

This constraint specifies that a modification of a student’s results is authorised only if the user has already checked the name of the student. The state transition graph of such a constraint is shown in Figure 4.

Figure 4: “always - before” predicate

CASE 4: The constraint is in the form of [sometimes $\psi$ before $\varphi$]. The following example illustrates such types of constraints.

∀ s, ∀ u, s:Student, u:User,
sometimes(read(u, s.name)) before
write(u, s.Results, .)

Figure 5 describes the state transition graph for the above constraint.

Figure 5: “sometimes - before” predicate

The  basic  state  transition  graphs  for  temporal  formu¬ 
las  are  shown  in  Figure  6.  Using  these  graphs,  one  can 
build  state  transition  graphs  for  complex  constraints 
over  multiple  states  of  a  federation.  These  complex 
constraints  are  built  from  other  constraints  by  logical 
composition  of  atomic  formulas.  Thus,  the  state  tran¬ 
sition  graphs  of  such  constraints  are  built,  in  a  similar 
way  to  their  logic  representations,  by  aggregation  of 
the  basic  state  transition  graphs  shown  on  Figure  6. 

However, a distinction needs to be made between aggregations of state transition graphs that have no True terminal node and those that do. This distinction is important because the corresponding formulas are monitored in different ways. The first category of graphs relates to formulas that cannot become permanently true after a certain state of a federation, because these graphs do not contain a True terminal node. We call these graphs false graphs. The second category of graphs involves those which contain a True terminal node. The True node of these graphs can be reached in some given state of a federation, meaning that the corresponding formulas are validated in that state and remain validated in all the future states of a federation. These graphs are called true graphs.

False graphs have been studied in [12], where algorithms for building and monitoring such graphs are provided. This paper deals with true graphs.

Figure 6: The Basic State Transition Graphs

3.1  True  Graph  Composition 

Let us first consider an example of constraints which are based on true graphs. The following constraint stipulates that the user u can never aggregate the attribute salary of an employee and the name of the department where the employee is working.

∀ e, ∀ d, e:Staff, d:Department,
sometimes(read(u, d.name)) ∧
sometimes(read(u, e.salary)) ∧
e is-in d.staff implies abort

The above formula is a logical composition of three formulas: ($\psi_1$) [sometimes(read(u, d.name))], ($\psi_2$) [sometimes(read(u, e.salary))] and ($\psi_3$) [(e is-in d.staff)]. Using the basic state transition graphs (shown in Figure 6), the state transition graphs of such atomic formulas can be built. Figure 7 illustrates these graphs, denoted as $G_1$, $G_2$ and $G_3$. Note that the graph $G_3$ is a simple graph because the corresponding formula checks in each state of a federation whether the user u is reading information about a staff member of a department. The state transition graph for the complex constraint $\psi_1 \wedge \psi_2 \wedge \psi_3$ is defined as the set of all possible aggregations of the graphs $G_1$, $G_2$ and $G_3$, i.e. $\{G_1 \oplus G_2 \oplus G_3,\ G_1 \oplus G_3 \oplus G_2,\ G_2 \oplus G_1 \oplus G_3,\ G_2 \oplus G_3 \oplus G_1,\ G_3 \oplus G_1 \oplus G_2,\ G_3 \oplus G_2 \oplus G_1\}$, where $\oplus$ represents the graph aggregation operator. This operator connects the nodes of the concerned graphs to form a complex state transition graph. Formally, it is defined in Algorithm 1.

Figure 7: Examples of State Transition Graphs


ALGORITHM 1 (Graph Composition) Let us consider two true graphs $G_i = (\{n_{i_1}, \ldots, n_{i_k}\}, \{t_{i_1}, \ldots, t_{i_l}\})$ and $G_j = (\{n_{j_1}, \ldots, n_{j_p}\}, \{t_{j_1}, \ldots, t_{j_q}\})$, where $n_{x_y}$ is a node and $t_{x_y}$ is a transition. We assume that these graphs are the state transition graphs of the formulas $\psi_i$ and $\psi_j$ respectively. The composition of $G_i$ and $G_j$, denoted as $G_i \oplus G_j$, produces the state transition graph $G_{ij} = (I, J)$ for the formula $\psi_{ij} = \psi_i \wedge \psi_j$, where

1. $I = \{n_{i_1}, \ldots, n_{i_k}, n_{j_1}, \ldots, n_{j_p}\}$ is the set of all nodes of the two graphs $G_i$ and $G_j$.

2. $J = \{t_{i_1}, \ldots, t_{i_l}, t_{j_1}, \ldots, t_{j_q}, t_{new}\}$ is the set of transitions of $G_{ij}$. The transition $t_{new}$ from $n_{i_x}$ to $n_{j_y}$ of $G_i$ and $G_j$ respectively is defined as follows:

(a) find a node of $G_i$ which is labelled as True, say $n_{i_x}$;

(b) delete the cyclic transition of the node $n_{i_x}$;

(c) if $n_{j_y}$ is the first node that is reached from the idle node in $G_j$, then create a transition $t_{new}$ from $n_{i_x}$ to $n_{j_y}$; and

(d) label the transition $t_{new}$ as true.

A True node of the graph $G_i$ is used as a basis to build the complex graph $G_{ij}$. During the checking of the validity of the complex formula $\psi_{ij}$, if the True node of $G_i$ has already been reached (or marked as we will see later), then the next step in the processing of $\psi_{ij}$ will be to check the formula $\psi_j$ using the graph $G_j$. Since the True terminal node of the graph $G_i$ has a cyclic transition, and to allow a transition from $G_i$ to $G_j$, this cyclic transition needs to be deleted (as stated in 2-b).

The last stage in building the graph $G_{ij}$ is to make the transition between the True node of $G_i$ and one of the nodes of the graph $G_j$. As stated in the condition 2-c, the first node in $G_j$ which is reached after the idle node is used in the next stage of the processing of the formula $\psi_{ij}$. Finally, the conditions (2-c and 2-d) create and label the transition $t_{new}$ with true to allow a direct transition from $G_i$ to $G_j$.
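The steps (a) to (d) of the composition can be sketched in Python as follows. The representation is an assumption made for illustration: a graph is a dictionary of node labels plus a set of labelled transitions, node identifiers are assumed to be disjoint between the two graphs, and `sometimes_graph` builds a hypothetical three-node true graph (idle node, waiting node, True node), since the exact shape of the basic graphs of Figure 6 is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    labels: dict        # node id -> label: "idle", "True" or a formula string
    transitions: set    # {(source, target, edge_label)}


def sometimes_graph(prefix, formula):
    """Hypothetical true graph for [sometimes formula]."""
    idle, wait, done = f"{prefix}0", f"{prefix}1", f"{prefix}2"
    return Graph(
        labels={idle: "idle", wait: formula, done: "True"},
        transitions={
            (idle, wait, "start"),
            (wait, wait, f"not {formula}"),
            (wait, done, formula),
            (done, done, "true"),          # cyclic transition of the True node
        },
    )


def compose(gi, gj):
    """Sketch of the composition of gi and gj following steps (a)-(d)."""
    # (a) find the True node of gi that still has its cyclic transition
    true_node = next(n for n, lab in gi.labels.items()
                     if lab == "True" and (n, n, "true") in gi.transitions)
    # (b) delete the cyclic transition of that node
    kept = {t for t in gi.transitions if t[:2] != (true_node, true_node)}
    # (c) locate the first node reached from the idle node of gj
    idle = next(n for n, lab in gj.labels.items() if lab == "idle")
    first = next(dst for src, dst, _ in sorted(gj.transitions)
                 if src == idle and dst != idle)
    # (d) create the new transition and label it true
    t_new = (true_node, first, "true")
    return Graph({**gi.labels, **gj.labels}, kept | gj.transitions | {t_new})
```

Selecting the True node that still carries a cyclic transition is a design choice of this sketch: it lets the result of one composition be composed again with a third graph, because only the terminal True node keeps its cycle.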

EXAMPLE 2 Figure 8 shows some of the compositions performed on the graphs $G_1$, $G_2$ and $G_3$ of Figure 7.

The graph generated by using the composition operator $\oplus$ can have some redundant nodes that are not required to be checked during the monitoring of aggregation constraints. These nodes relate to the non-terminal True nodes that have been used during the composition to link different graphs. If we consider for instance the composition graph $G_1 \oplus G_2 \oplus G_3$ of Figure 8, we notice that some nodes of such a graph are always true. This makes their checking unnecessary. These nodes basically allow the transition from the checking of one formula (i.e., $\psi_i$ in our case) to the checking of another formula (i.e., $\psi_j$). The terminal True nodes cannot be deleted because they relate to the situation where the complex formula is proven to be true.

ALGORITHM 2 (Graph Simplification) Given a graph $G = (I, J)$ and a non-terminal True node $n_i$ connecting two nodes $n_{i-1}$ and $n_{i+1}$ with transitions $t_{i-1}$ and $t_{i+1}$ respectively, we construct an equivalent graph $G^1 = (I - \{n_i\}, J - \{t_{true}\})$ of $G$, where

• true is the label of the transition $t_{true}$ from $n_i$ to $n_{i+1}$;

• $t_{i-1}$ is a transition from $n_{i-1}$ to $n_{i+1}$.

The graph $G^1$ is equivalent to $G$.
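The simplification can be sketched in Python as follows. The graph encoding (a dictionary of node labels and a set of `(source, target, label)` transitions) and the function name `simplify` are assumptions introduced for illustration; the sketch removes every True node that has an outgoing transition to another node and redirects its incoming transitions to the successor, while terminal True nodes are kept.

```python
def simplify(labels, transitions):
    """Remove non-terminal True nodes from a composed graph (a sketch)."""
    labels, transitions = dict(labels), set(transitions)
    for node in [n for n, lab in labels.items() if lab == "True"]:
        outgoing = [t for t in transitions if t[0] == node and t[1] != node]
        if len(outgoing) != 1:
            continue  # terminal True node (or unexpected shape): keep it
        successor = outgoing[0][1]
        incoming = [t for t in transitions if t[1] == node and t[0] != node]
        # Drop every transition touching the redundant node ...
        transitions -= {t for t in transitions if node in (t[0], t[1])}
        # ... and let its predecessors point directly at the successor.
        transitions |= {(src, successor, label) for src, _, label in incoming}
        del labels[node]
    return labels, transitions
```

Applied to a chain A → True → B → True, the first True node disappears and A is connected directly to B, while the final True node, which records that the complex formula has been proven, is preserved.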

EXAMPLE 3 Figure 9 shows two successive simplifications of the initial state transition graph $G = G_1 \oplus G_2 \oplus G_3$ of Figure 8, denoted as $G^1$ and $G^2$. Note that the graph $G^2$ cannot be simplified anymore even though it has a True node. This node is a terminal node and describes the final state of the computation of the complex formula $\psi_1 \wedge \psi_2 \wedge \psi_3$. The graph $G^2$ is called a reduced graph of $G$.

Figure 8: The composition of graphs $G_1$, $G_2$ and $G_3$, where A = sometimes(read(u1, d.name)), B = sometimes(read(u1, e.salary)), and C = e is-in d.Staff


4  Marking  Algorithms 

Monitoring  aggregation  constraints  consists  of 
checking  whether  the  federation  states  are  admissible 
in  the  context  of  a  certain  federation  history.  In  this 
section  we  describe  two  algorithms  which  enable  the 
monitoring  of  aggregation  constraints  by  marking  the 
nodes  of  state  transition  graphs. 

For a given complex formula, e.g., $\psi = \psi_1 \wedge \psi_2 \wedge \psi_3$, there may exist several state transition graphs (as shown in Figure 8). Thus, many reduced graphs can be derived. The graph $G^2$ is an example of a reduced graph of the constraint $\psi$. The checking of a complex formula is performed on the reduced state transition graphs generated from its initial state transition graphs by marking the nodes that have been found true in all past states of a federation.

Figure 9: The Simplification of $G = G_1 \oplus G_2 \oplus G_3$, where A = sometimes(read(u1, d.name)), B = sometimes(read(u1, e.salary)), and C = e is-in d.Staff

Using the example of Figure 9, the checking of the constraint $\psi$ can be done by marking the nodes of the different reduced graphs. For instance, if we assume that the node A has been validated in one of the past states of a federation, i.e. the user u has already read the name of a department (say d), then there are two possible scenarios to mark the remaining nodes of the state transition graphs:

1. either the user u is reading the salary of an employee, say e, in the current state of a federation. This means that the node B will be marked, and the next step in the checking of the constraint $\psi$ is to evaluate the node labelled as C;

2. or the node labelled by C needs to be evaluated before checking the node labelled by B, i.e. check whether the staff member (whose information the user u has read) is working in a department (which has been read by the user u).

The first scenario follows a linear marking of the reduced graphs. We call such a marking technique LMT (Linear Marking Technique), and a node is marked if and only if all its preceding nodes have been marked either in the past or current states of a federation. In contrast to the first scenario, the second scenario follows an unordered way of marking (called Zigzag Marking Technique), where any node of a reduced graph can be marked when its corresponding formula is made true in the current state.

4.1  Linear  Marking  Technique 

Let us consider a formula $\psi$ with $G^1, \ldots, G^m$ as the set of all its reduced (true) graphs. We assume that these graphs are defined as follows: $G^1 = (\{n_{1_1}, \ldots, n_{1_k}\}, \{t_{1_1}, \ldots, t_{1_{k-1}}\})$, $\ldots$, $G^m = (\{n_{m_1}, \ldots, n_{m_k}\}, \{t_{m_1}, \ldots, t_{m_{k-1}}\})$. We also assume that the nodes of these graphs can be marked by a function, namely mark: $M \rightarrow \{+, ?\}$, where $M$ is the set of possible nodes. In the initialisation phase, the nodes of the reduced graphs are marked as follows:

% initialisation phase %
for each i, 1 ≤ i ≤ m
    for each j, 1 ≤ j ≤ k
        if label($n_{i_j}$) ≠ True
        then mark($n_{i_j}$) = ?
        else mark($n_{i_j}$) = +

where the function label returns the label of a node. In the above initialisation phase, all the nodes of the reduced true graphs (except the nodes labelled with True) are marked with the symbol “?”, meaning that their corresponding formulas have not been validated. The nodes having True as a label are marked with the symbol “+”.

We assume that at time $t$, the reduced graphs have been marked according to the different events that occurred in the past states of a federation. At time $t + \Delta t$, which is the current state of the federation, an event occurs where the user is reading, updating, or combining information from different virtual objects. We describe these events by the different sub-transactions (i.e. read and write) which are initiated by the user of the federation. We denote these sub-transactions as $s_1, \ldots, s_h$. The LMT algorithm checks the different reduced graphs and traverses the nodes marked with the symbol “?”. The nodes that match with the sub-transactions $s_1, \ldots, s_h$ are marked with the symbol “+”, otherwise no modification in the marking of the nodes is performed.


% LMT Algorithm for True Graphs %

Input: The following input data is assumed

• a complex formula $\psi$

• $\psi$'s reduced graphs $G^1 = (\{n_{1_1}, \ldots, n_{1_k}\}, \ldots), \ldots, G^m$

• a set of sub-transactions $s_1, \ldots, s_h$

• variable status which has a value 1 when $\psi$ becomes true in the current state of a federation, 0 otherwise.

Output: value of status

Procedure: The following steps are performed
for each i = 1, m do
    (1) find the last node of $G^i$ marked with “+”, say $n_l$
    (2) find the next node of $n_l$ in $G^i$, say $n_{l_1}$
    (3) evaluate the label of $n_{l_1}$, say $\psi_{n_{l_1}}$, against $s_1, \ldots, s_h$
    if $\psi_{n_{l_1}}$ is true
    then
        (4) mark($n_{l_1}$) = +
        (5) find the next node of $n_{l_1}$ in $G^i$, say $n_{l_2}$
        (6) if label($n_{l_2}$) = true
            then
                (7-1) status = 1
                (7-2) exit
    (8) else status = 0

The above LMT algorithm marks the nodes of a reduced true graph in a linear manner. For a given reduced graph, step (1) finds the last node that has been marked with the symbol “+”, denoted as $n_l$. Steps (2) and (3) find the node that follows it, $n_{l_1}$, and evaluate the label of this node against the different sub-transactions that have been issued in the current state of a federation. This evaluation step checks whether or not the sub-transactions are part of the label of the node $n_{l_1}$, and then evaluates their logical combination (depending on the complex formula $\psi_{n_{l_1}}$ in the node $n_{l_1}$). If the formula becomes true, then the next step consists in checking whether the True node has been reached, meaning that there exists at least one reduced graph which has all its nodes marked with the symbol “+”. The steps (7-1) and (7-2) are important steps of the LMT algorithm because they avoid further marking of the remaining reduced graphs when at least one of the reduced graphs has all its nodes marked with the symbol “+”.
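One pass of the linear marking can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not taken from the paper: each reduced graph is a list of node labels in linear order ending with `"True"`, a node label is "evaluated to true" when it literally appears in the set of issued sub-transactions, and the names `init_marks` and `lmt_step` are hypothetical.

```python
def init_marks(graph):
    """Initialisation phase: True nodes get "+", all other nodes get "?"."""
    return ["+" if label == "True" else "?" for label in graph]


def lmt_step(graphs, marks, issued):
    """One LMT pass for the current state; returns the value of status."""
    for graph, mark in zip(graphs, marks):
        # (1)-(2) in a linearly marked chain, the node after the marked
        # prefix is simply the first node still marked "?"
        nxt = next((k for k, m in enumerate(mark) if m == "?"), None)
        if nxt is None:
            return 1                       # this graph is already fully marked
        # (3) evaluate the node label against the issued sub-transactions
        if graph[nxt] in issued:
            mark[nxt] = "+"                # (4)
            # (5)-(6) has the True terminal node been reached?
            if graph[nxt + 1] == "True":
                return 1                   # (7-1), (7-2)
    return 0                               # (8)
```

Because only the first unmarked node of each graph is examined, a sub-transaction that matches a later node (for instance C before B in the chain A, B, C) leaves the marking unchanged, which is the behaviour that distinguishes LMT from the zigzag technique.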

4.2  The  Zigzag  Marking  Technique 

Now we shall introduce the remaining algorithm for the true graphs. This algorithm is based on a non-linear search of a node that has not been marked with the symbol “+”. In contrast to the linear search of the LMT algorithm, the ZMT algorithm retrieves any node of a reduced graph when the label of this node can be matched with the set of sub-transactions issued in the current state of the federation. When all the nodes of a given reduced graph are marked with the symbol “+”, the algorithm exits in the same way as LMT because the formula has been validated within the current state of the federation.

% ZMT Algorithm for True Graphs %

Input: The following information is assumed

• a complex formula $\varphi$

• reduced graphs $G^1 = (\{n_{1_1}, \ldots, n_{1_k}\}, \ldots), \ldots, G^n = (\{n_{n_1}, \ldots, n_{n_k}\}, \ldots)$ of $\varphi$

• issued sub-transactions $s_1, \ldots, s_h$

• variable status

Output: value of status

Procedure: The following steps are performed
for each i = 1, n do
    for each j = 1, k do
        if mark($n_{i_j}$) = ?
            (1) evaluate the label of $n_{i_j}$, say $\psi_{n_{i_j}}$, against $s_1, \ldots, s_h$
            (2) if $\psi_{n_{i_j}}$ is true
            (3) then mark($n_{i_j}$) = +
    (4) check all the marks of $n_{i_1}, \ldots, n_{i_k}$
    (5) if all the nodes are marked “+”
    (6) then exit

Contrary to the LMT, the above algorithm checks all
the nodes of a reduced true graph which have “?” as labels.
If some of these nodes are evaluated to true against the
current sub-transactions, then these are marked with the
label “True”. Thus, the algorithm marks (as opposed
to the LMT) several nodes at the same time during the
processing of a formula. This can be an advantage over
the LMT algorithm; however, the complexity of the ZMT
increases when the state transition graphs become large
(i.e., n becomes very large). Finally, the complexity
of the above algorithm is O(n! × c_e), where c_e represents
the cost of matching a set of sub-transactions against a set
of atomic formulas. The ZMT algorithm is useful for false
graphs but not for true graphs (unless the number of nodes
is very small) because every node of a true state transition
graph must be marked [12].
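
The marking loop of the ZMT algorithm can be sketched as
follows. This is a minimal illustration, not the authors'
implementation: node labels are modelled as predicates over
the set of issued sub-transactions, and the helper requires()
and the function name zmt_mark() are hypothetical.

```python
from typing import Callable, FrozenSet, List

UNMARKED = "?"
MARKED = "+"

# Assumption: a node label is a predicate over the issued sub-transactions.
Label = Callable[[FrozenSet[str]], bool]


def requires(*names: str) -> Label:
    """Hypothetical label: true once all named sub-transactions are issued."""
    needed = frozenset(names)
    return lambda issued: needed <= issued


def zmt_mark(graphs: List[List[Label]],
             marks: List[List[str]],
             issued: FrozenSet[str]) -> bool:
    """One ZMT pass: mark every unmarked node whose label holds.

    Returns True as soon as one reduced graph has all its nodes marked.
    """
    for i, graph in enumerate(graphs):
        for j, label in enumerate(graph):
            if marks[i][j] == UNMARKED and label(issued):
                marks[i][j] = MARKED
        if all(m == MARKED for m in marks[i]):
            return True
    return False


# Two reduced graphs with two nodes each; marks persist between states.
graphs = [[requires("s1"), requires("s3")],
          [requires("s1"), requires("s2")]]
marks = [[UNMARKED] * len(g) for g in graphs]

first = zmt_mark(graphs, marks, frozenset({"s2"}))
second = zmt_mark(graphs, marks, frozenset({"s1"}))
```

In this run the second state completes the second reduced
graph, and nodes of both graphs are marked in the same pass,
which is the behaviour that distinguishes ZMT from the linear
search of LMT.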

5  Security  Agents 

This section addresses the design of a secure DOK
architecture to support the framework presented in the previous
sections. As mentioned earlier in this paper, DOK [15] is a
system providing federated services, such as a reengineering
service [14], query service [11], reflection service [2],
trader service, security service, and transaction service.
Each of these federated services is implemented as a server
and it is used by different processes to perform specific
functions, such as retrieving objects, processing queries
over a set of databases, etc.

To implement the different functions of each of the DOK
services, we have designed logical and physical architectures
aiming to support interoperability across different
database platforms. The logical architecture describes the
DOK layers and the interaction within and between these
layers, allowing efficient communication and processing of
the different functions of the DOK services. The physical
architecture describes the implementation of the different
components of the logical architecture. This section
focuses on the DOK logical architecture.

The DOK logical architecture is based on the use of
agents [4] to perform the functions of the DOK services.
Also, these agents are designed to be able to understand
and abstract (through a reflective process) information
embedded within different applications of a federation,
negotiate with remote agents to perform in collaboration
different activities of a federation, etc. In this section we will
describe the DOK security agents and relate their activities
to the algorithms proposed in the previous sections.

5.1  The  DOK  Agent  Model 

A DOK agent refers to an active entity which performs
specific tasks within a federation such as enforcing
security, ensuring the committing of global transactions,
mining resources or optimising global queries. In the DOK
agent model, an agent is graphically represented with an
icon (circle, rectangle, etc.; see Figure 10) and it contains
different information, such as the name of the agent, a set
of properties (which, for example, represent the information
required for enforcing security policies) and the
corresponding methods or functions (that is, the procedures
that keep a federation secure).

The interaction between agents is based on the different
relationships they have between them. The stronger
the relationship between two agents, the greater the
communication between them. Our model supports three
types of (communication) relationships: containment
relationship, association relationship, and inheritance
relationship. These relationships are supported by most
object-oriented models [10]; however, our relationships are
related to dynamic issues (e.g., communication between
agents) rather than static issues (e.g., attribute factorisation).
More importantly, these relationships are close
to those that exist between humans. In this way, containment
relationships are parental relationships allowing the
expression of a relationship between a father/mother and
their children. The association relationships are more or less
like neighbourhood relationships. They express some sort
of relationship between human agents; however, they are
weaker than containment relationships. Inheritance
relationships express a kind of specialisation of human agents
to perform specific tasks. An example of such a relationship
is the relationship between a staff member
and a teacher (or an administrator). A staff member is a
generic agent who can perform some functions; however,
a teacher is specialised in providing lectures for students
in the same way that an administrator is specialised in
performing administrative work.

Figure 10: The DOK Logical Architecture

Even though the relationships between agents look like
those of object-oriented models, they are different in
underlying semantics. A good example which justifies this
claim is the following. In our approach, if we want to
express a relationship between, for example, an agent which
represents the president of a country and its minister agents,
we will use containment relationships because they
express the sub-computations that the minister agents are
responsible for performing on behalf of the president agent.
However, in object-oriented data models, these relationships
will be represented as association relationships. As
we will see later, we will use an object-oriented model only
to represent the data used by agents during the communication
process (by sending, for example, the object identity
of an object located at a given site). In this way,
the agent model describes the agent information and
behaviour, whereas object-oriented models are used to
describe the data as well as the operations which can be applied
or used by agents.

The communication between a father agent and its children
can be qualified as a “strong relationship”. The
“weaker relationships” are modelled as associations. They
also express inter-dependencies between agents; however,
they are related to a mutual collaboration between agents
in order to perform one or multiple common functions.
Referring again to Figure 10, the association relationship
between the DOK Manager and the Wrapper agent represents
a sort of collaboration in which one of these agents
can request, for example, the transformation of a schema of
a given database so that it can be understood by the different
agents of a federation. Agents manipulate data which are in the
form of objects [10]. In this way, a schema of a relational
database will be transformed into an object-oriented
schema by using one of the functions of the wrapper
proposed in [14]. Note that the Wrapper agent may communicate
with the DOK Manager and request, for instance,
some of its services.

The last relationship is the inheritance relationship
between agents. The schema of Figure 10 does not have one;
however, a simple example of such a relationship can be
defined between the Wrapper agent of Figure 10 and a
new agent which models the Wrapper for a specific
database, e.g., an Oracle database. We call this new
agent a WrapperOracle agent. It is like any
other wrapper in that it can perform any of the functions
of an ordinary wrapper. However, the difference from the
other wrapper agents is that WrapperOracle is specialised
for Oracle databases, that is, it can transform (or translate)
only schemas defined with an Oracle database. In the same
way, this wrapper only has an understanding of the
security information (e.g., access control) of Oracle databases.
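
The three relationships of the agent model can be sketched
in code as follows. This is only an illustration under stated
assumptions: the class layout, the attribute names children
and associates, and the string returned by translate() are
hypothetical, and the choice of which agents the DOK Manager
contains is made up for the example rather than taken from
Figure 10.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Agent:
    """An agent with a name, properties, and two kinds of links."""
    name: str
    properties: Dict[str, object] = field(default_factory=dict)
    children: List["Agent"] = field(default_factory=list)    # containment
    associates: List["Agent"] = field(default_factory=list)  # association

    def contain(self, child: "Agent") -> None:
        # Containment: the parent delegates sub-computations to the child.
        self.children.append(child)

    def associate(self, other: "Agent") -> None:
        # Association is a mutual collaboration, so link both directions.
        self.associates.append(other)
        other.associates.append(self)


@dataclass(eq=False)
class Wrapper(Agent):
    def translate(self, schema: str) -> str:
        # Placeholder for the schema translation of an ordinary wrapper.
        return f"object-oriented({schema})"


@dataclass(eq=False)
class WrapperOracle(Wrapper):
    """Inheritance: a wrapper specialised for Oracle databases."""

    def translate(self, schema: str) -> str:
        return f"object-oriented(oracle:{schema})"


manager = Agent("DOK Manager")
gsp = Agent("Global Security Processor")
wrapper = WrapperOracle("WrapperOracle")
manager.contain(gsp)        # strong relationship (assumed for illustration)
manager.associate(wrapper)  # weaker, collaborative relationship
```

WrapperOracle remains usable wherever a Wrapper is expected,
which mirrors the statement that it can perform any function
of an ordinary wrapper while being specialised for one
database product.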

5.2  The  Agents 

As for other functions of the DOK system, the
enforcement of the security policies by the DOK agents is
a multi-level process. At the top level, the DOK Manager
or Wrapper agents (depending on the level of the security
checking) identify the type of task to perform in the
federation (e.g., authentication). The agents of this top level
are aware of all the activities which are happening (or
have already happened) in the federation. Also, they can access
information about each agent (e.g., roles, functions) and
request other agents to process some functions. In this
way, the top layer of the DOK involves what we call
coordination agents. They are responsible for the coordination
of all activities of a federation rather than performing the
tasks themselves. These tasks (or functions) are instead
delegated to more specialised agents of lower levels. At
the middle level, specific functions (such as enforcement
of global constraints or the sanitisation of query results)
are performed by what we call security agents. Examples
of such agents are the Global Security Processor or
Constraint Manager of Figure 10.

Security agents differ from coordination agents because
they are specialised in performing specific functions of
the DOK system. Thus the security agents have a narrow
visibility of a federation, that is, they have knowledge
only about agents which perform the same function. The
bottom level is comprised of a set of agents specialised
in accessing or updating information required by agents
of higher layers. These agents are database-like agents
playing the role of an interface between the participating
databases and the agents of the top and middle layers. They
are called database agents. An example of such agents
is the User agent which records all information about a
particular user, including the different access rights that
he/she has for different objects as well as the identity of
the user.

This three-layered architecture of the DOK system has
many advantages. Each layer involves a set of agents
which are responsible for certain activities in a federation.
These activities can be management activities, that
is, they are performed by coordination agents which
oversee the running of a federation. Other activities
are, for example, those which relate to a specific task. These
activities are performed by specialised agents. In this way,
specialised agents have a narrow view of the whole system.
Finally, the last activities concern simpler functions
which are related, for instance, to the “preparation” of data to
be used either by specialised or coordination agents, the
storage of data related to objects of local databases, etc.
This classification of agents based on their activities in a
federation shows a sort of logical clustering (or modules)
of the DOK agents. Figure 10 illustrates the different DOK
clusters, that is, the coordination server, security server,
transaction server, query server, database server, etc.

As mentioned earlier, the agents of the different DOK
modules interact to perform the required functions of the
DOK system. The collaboration between the agents of the
same cluster (server) is larger and more intense than that
between agents which belong to different clusters. The main
reason for this is that if two agents are involved in performing
the same function, they should be related by stronger relationships,
i.e., the containment relationship. In the opposite case,
the two agents will be involved in a “weaker” collaboration.
This means that they need to be defined in different
modules. Since this article is only related to security
enforcement, we have illustrated in Figure 11 the interaction
only between security agents in order to enforce both local
and federated security policies.


5.2.1  Coordination  Layer 

This level involves a set of agents specialised in the
coordination of the different activities of a federation. With
regard to security enforcement, the DOK Manager and the
Wrapper agent are the only agents with an overall
understanding of how to keep a federation secure. When a query
is sent to a local database, the wrapper authenticates the
user and determines whether the query is related to the
local database or to the federation. If it is a local query, the
wrapper will delegate the processing to the local database.
Otherwise, the query is translated from the language used at
the local level into a global-level language [14] and then
processed by the DOK Manager.
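
The routing decision made by the wrapper can be sketched as
below. This is a simplified illustration, not the DOK
implementation: the function name route_query(), the callback
is_authenticated, and the rule that a query is local when it
touches only local tables are all assumptions introduced for
the example.

```python
from typing import Callable, Set


def route_query(user: str,
                is_authenticated: Callable[[str], bool],
                local_tables: Set[str],
                referenced_tables: Set[str]) -> str:
    """Return which component should process the query (illustrative)."""
    if not is_authenticated(user):
        return "rejected"
    # Assumption: a query is local when it touches only local tables.
    if referenced_tables <= local_tables:
        return "local database"
    # Otherwise it would be translated to the global level and forwarded.
    return "DOK Manager"


known_users = {"ann"}
local = {"staff", "payroll"}

r_local = route_query("ann", known_users.__contains__, local, {"staff"})
r_global = route_query("ann", known_users.__contains__, local,
                       {"staff", "accounts"})
r_denied = route_query("eve", known_users.__contains__, local, {"staff"})
```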

WRAPPER  AGENT 

The role of this agent is to translate the query information
from the local level to the global level and vice versa. This
will allow the DOK Manager and other agents to have a
better understanding of information contained within local
databases. In terms of security, the Wrapper is responsible
for authenticating local users at the global level
before presenting the query. The wrapper agent checks
the security attributes of the user, and relays them to the
DOK Manager for enforcement of global security. This
is depicted by the Component Database, User and Data
agents, and the relationships among these agents (see
Figure 10). The methods provided by the Wrapper agent
are authenticate_user(), connect(), disconnect() and
translate(). The method authenticate_user() invokes
communication with the User, Data and Component Database
agents to gather information about the types of access
rights a user has to certain data when interrogating other
component databases.

Figure 11: Collaboration between DOK Agents

The methods connect() and disconnect() establish the
communication between the wrapper and the federated
level by connecting or disconnecting a database from
a federation, whereas the method translate() does
the query translation from the local to the global level
using the algorithms provided in [14]. The methods
generate_access_list() and integrate_access_right() allow
the mapping and the integration of security information
of local databases into the federated level. These methods
have been described in the previous section.


DOK  MANAGER 

This agent allows the creation of DOK instances for specific
federated environments. The DOK Manager contains
an attribute name (e.g., banking or schooling) for
identification purposes. Every instance of this agent runs on a
specific federation and can communicate with other federations.
Most of the functions of the DOK Manager agent are
performed by delegating to other agents. For instance, a
DOK Manager ensures the completeness and correctness of
global transactions by requesting the execution of methods
of the Global Transaction Manager and the Global Query
Processor. More importantly, since we are discussing
security, the DOK Manager would solicit the methods of the
Global Security Processor to enforce global security. Let
us consider in detail the security procedures of the DOK
Manager. When a request is sent by the wrapper to the
DOK Manager to process a query, as shown in Figure 11,
the DOK Manager performs the following steps:

• Checks whether a user can access individual aggregates
specified in the query. At this level, the global
access control mechanism has been used to generate
the federated access list for each aggregate of a virtual
object. The methods generate_access_list() and
integrate_access_right() are described in [13]. The
DOK Manager requests wrappers to generate the access
lists for each local aggregate of the participating
databases. The generated access lists are then integrated
to become the federated access list of the aggregate.
The DOK system will grant or deny access
or update of the information to the user according to
the federated access list.

• When a user is effectively allowed to access or update
individual aggregates specified in the query, the
second step of the query processing consists of checking
whether or not the user is allowed to combine
aggregates to derive information denied to him/her.
At this stage of the pre-processing of the query, the
DOK Manager will delegate the enforcement of the
constraints to the Constraint Manager which will look
for all types of constraints related to the corresponding
user. When a set of constraints is found, the
query is re-written by the Query Modifier to include
the constraints within the query.
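
The two steps above can be sketched as follows. The sketch
makes strong assumptions that are not taken from [13]: the
federated access list is computed here as the intersection of
the local access lists, constraints are plain strings, and
the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def integrate_access_lists(local_lists: List[Set[str]]) -> Set[str]:
    """Assumed integration rule: a user must appear in every local list."""
    if not local_lists:
        return set()
    return set.intersection(*local_lists)


def preprocess_query(user: str,
                     aggregates: List[str],
                     federated_acl: Dict[str, Set[str]],
                     constraints: Dict[str, List[str]]) -> Optional[List[str]]:
    """Step 1: per-aggregate access check. Step 2: collect constraints.

    Returns None when access is denied, otherwise the constraints that a
    query modifier would add to the query.
    """
    for aggregate in aggregates:
        if user not in federated_acl.get(aggregate, set()):
            return None
    return list(constraints.get(user, []))


acl = {
    "salary": integrate_access_lists([{"ann", "bob"}, {"ann"}]),
    "name": integrate_access_lists([{"ann", "bob"}]),
}
user_constraints = {"ann": ["not(name and salary)"]}

denied = preprocess_query("bob", ["name", "salary"], acl, user_constraints)
allowed = preprocess_query("ann", ["name", "salary"], acl, user_constraints)
```

Here ann passes the per-aggregate check, but the returned
aggregation constraint would still be attached to her query
before execution.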


5.2.2 Task Layer

At this level, agents perform specific tasks to ensure that
all the aspects of security processing are carried out properly
to maintain global security. The tasks of maintaining
federated security policies are delegated by the DOK
Manager to specialised agents such as the Global Security
Processor, Query Modifier, Release Database Manager,
Release Database agent and the Response Processor. Here we
focus on the agents responsible for the enforcement of
aggregation constraints.

GLOBAL SECURITY PROCESSOR (GSP)

The main role of this agent is to assist the DOK Manager
in enforcing federated security policies. It does this
via methods such as enforce_constraints(), build_graph(),
mark_graphs(), and sanitise_query_results(). The method
check_constraints() preprocesses, where possible, the security
aspects of a global query before they are sent out to
the affected sites for execution. This does not include
constraints dependent on the result of the query, such as
constraints restricting the number of instances of an object
or facet that a subject is allowed to retrieve in a query.
The sanitise_query_results() method post-processes the
results of the query to enforce constraints of this type.
The execution of each of the methods described above results
in communication between agents. The agents
involved include the Query Modifier, Response Processor,
Release Database Manager, Release Database, Constraint
Manager, Constraint and Wrapper agents.

The remaining methods of the GSP agent relate to
the enforcement of aggregation constraints. This is done
through the triggering of the method check_constraints()
which is delegated to the Constraint Manager.

CONSTRAINT  MANAGER 

When an event is received from the GSP agent, the
Constraint Manager performs the following tasks:

(i) it finds appropriate constraints for a given event in a
federation using search_constraints();

(ii) it builds the state transition graphs for the selected
constraints using generate_graph(). This method is
based on Algorithms 1 & 2 of Section 3.

(iii) it monitors the selected constraints using the methods
LMT_method() and ZMT_method() of Section 4.
If any violation of constraints is detected, then actions
are triggered by the Constraint agent.

The sanitisation of queries is delegated by the GSP
agent to the Constraint Manager. The processing of these
specific constraints (i.e., those that are concerned with the
sanitisation of query results) is performed in a similar way to
aggregation constraints. The only difference between the
processing of aggregation constraints and sanitisation
constraints is that the former is performed before any evaluation
of the user query, whereas the latter is done after.
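
The difference in timing between the two kinds of constraints
can be illustrated with a short sketch. It is written under
assumptions that go beyond the text: the sanitisation
constraint is modelled as a simple cap on the number of
returned instances, enforced by truncation, and all names
are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Rows = List[Dict[str, object]]


def run_with_constraints(query: str,
                         pre_check: Callable[[str], bool],
                         execute: Callable[[str], Rows],
                         max_instances: Optional[int]) -> Optional[Rows]:
    """Pre-check the query, execute it, then sanitise the result."""
    # Aggregation constraints: evaluated before the query runs.
    if not pre_check(query):
        return None
    rows = execute(query)
    # Sanitisation constraint (assumed form): cap the number of instances.
    if max_instances is not None:
        rows = rows[:max_instances]
    return rows


def fake_execute(query: str) -> Rows:
    # Stand-in for the evaluation of the query at the affected sites.
    return [{"id": i} for i in range(5)]


capped = run_with_constraints("q", lambda q: True, fake_execute, 2)
blocked = run_with_constraints("q", lambda q: False, fake_execute, 2)
```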

6  Concluding  Remarks 

In the area of distributed databases, much work has
been focussed on providing appropriate access control
[6, 5]. However, aggregation in distributed databases is
currently the most difficult and challenging problem. This
deals with the issue of inferring data classified as high level
(or high data) from some set of data classified as low level
(or low data) [7]. That is, there is a direct inference path
(possibly including external data) from the low data to the
high data.

Existing solutions for the aggregation problem can be
classified according to the type of inference channel to be
detected. As discussed in [7], three types of channels can
be detected: logical inference channels, abductive inference
channels, and probabilistic channels.

1. Approaches based on the detection of logical channels
focus on the construction of formal deductive proofs
showing the existence of the derivation of high data
from low data. The proposed solutions, mainly
elaborated by the researchers of the Computer Science
Laboratory at SRI International, deal with the
use of formal theorem proving for detecting inference
channels [9].

2. A slightly weakened requirement for a logical channel
is when a deductive proof may not be possible, but
a proof could be completed by assuming certain
axioms. The development of an abductive proof
takes into account the degree to which a user is likely
to know some facts necessary to the completion of a
proof.

The approaches for the detection of abductive channels
are based on epistemic logics [3] which “relax” the
conventional methods of reasoning (theorem proving)
to include the user’s beliefs as a part of the model.

3. The detection of probabilistic channels is based on
the inference of high data with some measure of belief
greater than an acceptable limit. Buczkowski, in [1],
uses a probabilistic model (based on Bayesian probability)
to estimate security risk due to partial inference.
The above approaches (except the probabilistic one) are
based on constructing a proof (or building a model, that is
the generation of all possible high sensitive data [8]). These
approaches are formally sound and are useful in detecting
different channels. However, we believe that they are limited
in their use for large scale applications, particularly in
distributed environments.

The proposed approach has advantages and disadvantages.
One of the advantages of the DOK security approach
is that it is based on computational issues of aggregation
constraints instead of building formal proofs. Our
approach builds appropriate data structures (i.e., state
transition graphs) to model the different sub-computations of
aggregation constraints. These sub-computations are
represented as nodes labelled with atomic (temporal) formulas.
A marking technique is proposed to monitor such
constraints depending on the type of state transition graphs.

The limitations of the DOK approach regarding the
enforcement of federated security policies relate to the
complexity of the construction and the marking of state
transition graphs. For simple aggregation constraints, the
proposed approach is very efficient. However, for more complex
constraints, appropriate access methods are needed to
access a large database of nodes (of state transition graphs)
to enable efficient enforcement of federated security
policies.

We are currently implementing the proposed security
framework using NEO and KQML. Coordination agents
are being implemented first and the remaining (security)
agents are designed based on the algorithms proposed in
previous sections.
Abstract

Internet access to medical data has greatly facilitated
information sharing. As health care institutions
become more willing or more pressured to share some
of their protected information, tools are being developed
to facilitate the information transfer while protecting
the privacy of the data. To this end, under the
TIHI project, we have designed a security mediator, a
software entity that screens both incoming queries and
outgoing results for compliance with a medical institution's
policies pertaining to data privacy. The system
is under the control of a security officer, who enters
simple rules into the system that implement the policies
of the institution. In this paper, we describe the
implementation of the security mediator dual
interface. The customer interface allows outsiders to
request and receive filtered medical information from a
hospital database. The security officer interface permits
rule editing and resolution of cases not covered
by the rule-set.

1  Introduction 

The  TIHI  project  [1,  2]  has  led  to  the  design  of  a 
software  system  which  allows  secure  sharing  of  medical 
information  over  the  Internet  [8].  It  is  designed  to 
support  interaction  with  collaborators,  rather  than  to 
prohibit  attack  by  foes.  Therefore,  it  is  best  used  in 
conjunction  with  more  defensive  security  techniques 
such  as  public/private  key  systems  or  firewalls. 

The  central  component  of  the  system,  the  security 
mediator,  is  a  gateway  between  a  medical  institution 
(e.g.,  a  hospital),  and  outsiders  (customers)  that  have 
a  legitimate  right  to  or  interest  in  the  institution’s 
medical  information.  Typical  customers  include: 

•  Public  Health  Agencies 

•  Medical  Researchers 

•  Community  or  Specialty  Physicians 

•  Insurance  Companies 


The security mediator is a tool that belongs to the
security officer, the person responsible for enforcing
the medical institution’s policies concerning patient
data security and privacy. The security mediator helps
the security officer enforce these policies by translating
a security policy into a set of rules. These rules
belong to three categories, depending on whether they
affect the customer himself (setup rules), queries submitted
by the customer (query rules), or results that
follow from queries (result rules). Setup rules verify
the customer’s name and password, and restrict the
days and times when access is allowed (e.g., a billing
clerk may not be allowed to access the system on weekends).
Sample query rules are Check Tables (the customer
is restricted to specific tables in the database)
and Check Select (the customer is limited to one select
statement per query). The most important result rule
is Check Dictionary, which checks each word contained
in the results against a user-dependent dictionary to
ensure that no sensitive textual information is released
to the customer.
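
The three rule categories can be illustrated with a small
sketch. It is not the TIHI implementation: the function
names, the regular expressions, and the representation of a
dictionary as a set of lower-case words are assumptions, and
the name/password part of the setup rules is omitted.

```python
import re
from datetime import datetime
from typing import Set


def check_setup(now: datetime, weekend_access: bool) -> bool:
    """Setup rule: optionally forbid access on Saturday and Sunday."""
    return weekend_access or now.weekday() < 5


def check_select(query: str) -> bool:
    """Query rule 'Check Select': at most one SELECT per query."""
    return len(re.findall(r"\bselect\b", query, flags=re.IGNORECASE)) <= 1


def check_tables(tables_in_query: Set[str], allowed: Set[str]) -> bool:
    """Query rule 'Check Tables': only tables granted to the customer."""
    return tables_in_query <= allowed


def check_dictionary(result_text: str, dictionary: Set[str]) -> Set[str]:
    """Result rule 'Check Dictionary': return words not in the dictionary."""
    words = set(re.findall(r"[A-Za-z]+", result_text.lower()))
    return words - dictionary


saturday = datetime(2024, 1, 6)
nested = "SELECT a FROM t WHERE b IN (SELECT c FROM u)"
unknown = check_dictionary("Patient has HIV", {"patient", "has"})
```

A non-empty set returned by check_dictionary() would
correspond to a rule violation that is passed to the security
officer for review.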

When a rule violation is detected, the query, the
results, or both are sent to the security officer for review.
The security officer can either approve the query
as is, approve an edited form of the query, or approve
a filtered set of results. Results checking is a crucial
augmentation to the common model of secure access,
in which no further validation is done after authentication,
authorization, and certificate issue for access
rights. In practice, results checking is a critical step,
because the organization of the records in an institution
is structured to deal with efficient local use, not
with the secure matching of categories to external access
rights.

All interactions with the system are recorded in the
Audit Trail database. The security officer can use the
Audit Trail to fine-tune the system. For example, if a
customer has been entering queries in an attempt to
circumvent an access restriction, the security officer
can force all of the customer’s queries to be reviewed
manually. On the other hand, if a large number of
safe queries get bumped to the security officer unnecessarily,
he can relax the rules to allow the queries to
pass without manual review. Each clique’s dictionary
can be incrementally constructed as well, with words
being added as the results are manually approved by
the security officer.

Figure 1: Overall Architecture.

The  subsequent  sections  give  details  of  the  system 
architecture  and  the  WWW  implementation. 

2  Overall  Architecture 

The security mediator regulates access to database
information by screening customers, queries and contents
of results. The overall architecture is described
below and diagrammed in Figure 1.

The back-end of the system is a source database
containing the information that the customers are interested
in. Tables from this database could include
a Patient Demographics Table, a Medical History Table,
and a Billing Table. This information resides on a
central computer which can only be accessed by authorized
personnel inside the medical institution. Therefore,
the machine need not be multi-level secure.

Another component of the system is the mediator
database, which stores the User Table (containing the
usernames and passwords of registered customers), the
Rules Table (containing the policy rules that govern
query and result screening), and the Audit Table
(a record of all transactions, including date, time,
user identification, queries, results, and possible rule
violation statements). This database typically resides
on a Unix workstation protected by a multi-level security
system.

The  mediator  engine  sits  on  the  Unix  station  de¬ 
scribed  above.  It  consists  of  a  collection  of  executable 
routines  and  scripts  that  work  in  concert  to  access 
the  mediator  and  source  databases  in  response  to  cus¬ 
tomers’  queries  or  to  the  security  officer’s  input. 

Communicating  with  the  security  mediator  engine 
are  the  Web-based  customer  and  security  officer  in¬ 
terfaces.  The  customer  interface  allows  customers  to 
submit  queries  and  to  retrieve  results  from  remote 
sites  which  run  any  operating  system  that  supports 
a  WWW  connection.  The  security  officer  interface 
permits  rule  updates  and  audit  trail  look-ups.  The  se¬ 
curity  officer  need  not  operate  from  the  Unix  station 
that  holds  the  mediator  engine  and  database. 

3 Implementation

The  current  version  of  the  TIHI  prototype  has  been 
implemented  using  HTML  forms  and  CGI  scripts  to 
connect  the  front-end  Web  interfaces  with  the  internal 
databases.  The  architecture  consists  of  four  layers:  an 
HTML  forms  user  interface,  Perl  CGI  scripts,  C  rou¬ 
tines,  and  embedded  SQL  database  functions.  Details 
can  be  found  in  Figure  2.  The  interfaces  for  both  the 
customer  and  the  security  officer  are  Web-based,  and 
accessible  from  any  browser. 

3.1  Customer  Interface 

The  customer  interface  consists  of  three  screens: 
the  login  screen,  the  custom  medical  database  access 
screen,  and  the  result  screen.  Controlling  the  access 
and  result  screens  are  three  Perl  scripts:  the  Login 
Processor,  the  Query  Processor,  and  the  Result  Pro¬ 
cessor. 

The  Login  Processor  reads  the  username,  clique 
(membership  group),  and  password  from  the  login 
screen  (Figure  3),  and  retrieves  from  the  Rules  Table 
the  setup  rules  associated  with  the  customer’s  clique. 
It  then  cycles  through  the  relevant  rules,  calling  each 
rule’s  corresponding  C  routine.  Each  routine  returns 
a  Pass  or  Fail  flag.  If  a  rule  violation  is  detected, 
the  Login  Processor  generates  a  standard  error  screen 
(in  HTML)  and  returns  it  to  the  customer.  No  ex¬ 
planation  is  given  to  the  customer  as  to  why  the  lo¬ 
gin  failed,  since,  given  information,  the  customer  may 
be  able  to  circumvent  the  rule  that  restricted  access. 
If  all  the  setup  rules  pass,  the  customer  is  provided 
with  a  custom  database  access  screen,  which  imposes 
a  customer-dependent  type  of  query.  For  example,  a 
billing  clerk  would  be  prompted  to  enter  a  patient  ID 
number,  not  a  patient  name,  because  billing  clerks 
need  not  (and  probably  should  not)  know  patient 
names  in  order  to  perform  their  transactions  (Fig¬ 
ure  4).  Finally,  the  Login  Processor  records  a  suc¬ 
cessful  login  entry  into  the  Audit  Table. 
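
The rule cycle performed by the Login Processor can be sketched as follows. This is a minimal illustration in Python rather than the Perl and C of the prototype; the names `process_login`, `known_user`, and `password_ok`, the sample credentials, and the result structure are assumptions made for the example and are not part of TIHI.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A rule inspects the login context and returns True (Pass) or False (Fail).
Rule = Callable[[Dict[str, str]], bool]


@dataclass
class LoginResult:
    passed: bool
    screen: str               # page returned to the customer
    audit_entries: List[str]  # entries to append to the Audit Table


def process_login(context: Dict[str, str], setup_rules: List[Rule]) -> LoginResult:
    """Apply each setup rule of the customer's clique; stop at the first Fail."""
    for rule in setup_rules:
        if not rule(context):
            # Deliberately generic: the customer is not told which rule failed.
            return LoginResult(False, "Login failed.", [])
    entry = f"login ok: {context['username']} ({context['clique']})"
    return LoginResult(True, f"access screen for {context['clique']}", [entry])


# Hypothetical rules, for illustration only.
def known_user(ctx: Dict[str, str]) -> bool:
    return ctx["username"] in {"alice"}


def password_ok(ctx: Dict[str, str]) -> bool:
    return ctx["password"] == "s3cret"
```

The error screen carries no detail about the failing rule, which mirrors the design choice described above.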

The  customer  then  enters  a  query  either  in  SQL  or 
by  filling  out  custom  forms,  depending  on  the  clique. 
For  example,  members  of  the  patient  clique  can  only 
request  their  own  record,  so  the  query  is  built  by 
the  mediator  using  the  patient’s  name.  Medical  re¬ 
searchers,  however,  are  allowed  to  enter  full  SQL  re¬ 
quests.  The  query,  as  well  as  information  about  the 
customer  and  the  clique,  is  sent  to  the  Query  Proces¬ 
sor  via  HTML  forms.  The  Query  Processor  then  ob¬ 
tains  the  pre-processing  (query  processing)  rules  asso¬ 
ciated  with  the  customer’s  clique,  and  cycles  through 
the rules in the same manner as did the Login Processor.
If the query passes all relevant rules, then the
results are retrieved and processed by the Result Processor.
All successful queries are recorded in the Audit
Table. Unsuccessful queries are sent to the Review
Queue (explained below).

Successful  queries  cause  the  mediator  to  retrieve 
the  corresponding  results  and  to  screen  them  using  the 
Result  Processor.  Post-processing  (result  processing) 
rules  are  retrieved  from  the  Rules  Table  and  applied  to
the  results.  If  no  rule  violation  occurs,  the  results  are 
presented  to  the  customer  in  HTML  tables  format.  A 
rule  violation  causes  the  query  that  yielded  the  results 
to  be  sent  to  the  Review  Queue. 

If  a  query  is  unsuccessful  because  of  a  rule  violation, 
an  entry  is  made  in  the  Audit  Table  section  called  the 
Review  Queue.  The  username,  clique,  query,  and  the 
rule  broken  are  all  stored  in  one  entry  of  the  Review 
Queue  (Figure  5).  The  Security  Officer  can  examine 
each  entry  and  decide  whether  the  query  should  be 
allowed.  The  Security  Officer  has  the  option  of  editing
the  query  or  rejecting  it  altogether.  In  the  former  case, 
the  security  officer  edits  either  the  query  or  the  results 
(or  both),  and  sends  the  results  via  e-mail  to  the  query 
issuer  (Figure  6).  Otherwise,  the  customer  is  notified 
via  e-mail  that  the  request  was  rejected. 
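
The flow described in this section (pre-processing rules, retrieval, post-processing rules, and the Review Queue) can be sketched as one function. This is an illustrative Python sketch only: `mediate`, `no_names`, `small_result`, and the list-based audit table and review queue are hypothetical stand-ins, not the prototype's Perl scripts or database tables.

```python
from typing import Callable, List, Optional, Tuple

QueryRule = Callable[[str], bool]           # pre-processing rule on the query text
ResultRule = Callable[[List[tuple]], bool]  # post-processing rule on result rows


def mediate(
    query: str,
    pre_rules: List[QueryRule],
    post_rules: List[ResultRule],
    run_query: Callable[[str], List[tuple]],
    audit: List[str],
    review_queue: List[Tuple[str, str]],
) -> Optional[List[tuple]]:
    """Return result rows, or None if the query was sent to the Review Queue."""
    for rule in pre_rules:
        if not rule(query):
            review_queue.append((query, rule.__name__))  # query plus the rule broken
            return None
    audit.append(query)                                  # successful query is logged
    rows = run_query(query)
    for rule in post_rules:
        if not rule(rows):
            review_queue.append((query, rule.__name__))
            return None
    return rows


# Hypothetical rules, for illustration only.
def no_names(query: str) -> bool:
    return "name" not in query.lower()


def small_result(rows: List[tuple]) -> bool:
    return len(rows) <= 2
```

A `None` return corresponds to the case in which the customer receives nothing until the Security Officer has reviewed the queued entry.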

3.2  Security  Officer  Interface 

The  Security  Officer  enforces  the  security  policies 
of  the  medical  institution  using  the  TIHI  system.  She 
builds  cliques  and  rule-sets,  monitors  system  usage, 
and  approves  or  rejects  queries  and  results  that  the 
Security  Mediator  disallowed. 

The Security Officer HTML interface main page
gives the Security Officer a choice of eight functions,
which can be divided into two categories: system monitoring
and general maintenance.

System  Monitoring: 

•  Edit  Results:  The  Security  Officer  can  edit  unac¬ 
ceptable  results  of  queries  in  the  Review  Queue, 
and  either  send  the  filtered  results  to  the  customer 
or  reject  the  request  altogether. 

•  Edit  Query:  The  Security  Officer  can  either  edit 
unacceptable  queries  and  send  the  results  to  the 
customer,  or  reject  the  request  altogether. 

•  Edit  Dictionary:  The  Security  Officer  can  add 
to  or  delete  words  from  each  clique’s  dictionary. 
The  Edit  Clique  and  Edit  Dictionary  functions, 
used  in  conjunction  with  the  Review  Queue,  allow 
the  Security  Officer  to  refine  a  clique’s  rule-set 
and  dictionary  in  response  to  the  results  being 
requested. 

Figure 2: Details of Implementation.


•  View  Audit  Trail:  The  Security  Officer  can  make 
a  custom  query  on  the  Audit  Trail  database  for  re¬ 
porting  and  investigative  purposes,  or  to  improve 
the  rule-set. 

General  Maintenance: 

•  Create  Clique:  The  Security  Officer  enters  the 
clique’s  name,  and  the  names  and  e-mail  ad¬ 
dresses  of  the  new  clique’s  users.  She  can  also 
choose  a  rule-set  for  the  new  clique  from  the  cat¬ 
alogue  of  rules  in  the  system. 

•  Edit  Clique:  The  Security  Officer  can  add  or 
delete  users,  add  to  or  delete  rules  from  the 
clique’s  rule-set,  or  change  the  parameters  for  the 
active  rules  in  the  clique’s  rule-set. 

•  Edit  User  Database:  The  Security  Officer  can  add 
to  or  delete  users  from  the  database.  If  a  deleted 
user  is  the  only  member  of  a  clique,  the  clique  is 
deleted  as  well. 

•  Edit  Default  Rules:  The  Security  Officer  can  add 
or  delete  rules,  and  change  the  parameters  for  the 
active  rules  in  the  default  rule-set. 


Throughout the login/query entry/result retrieval
process, the activities of the customer and any intervention
by the Security Mediator are recorded in the
Audit Table. This information is then used by the
Security Officer to generate reports and uncover suspicious
trends in the use of the system. The Mediator
itself uses the Audit Table to retrieve information necessary
for rule application (e.g., the Last Login Time
rule).

4  Future  Implementation 

The  next  generation  TIHI  system,  which  is  under 
initial  development,  will  differ  from  the  current  proto¬ 
type  in  several  respects. 

First, the functionality of the rule-set will be increased.
Instead of the static collection of rules currently
used, the security officer will enjoy a dynamic
rule environment. A rule compiler will be added that
will allow the Security Officer to construct rules. For
example,  suppose  a  Billing  Clerk  should  have  differ¬ 
ent  access  rights  depending  on  the  time  of  day.  The 
Check  Times,  Check  Tables,  and  Check  Columns  rules 
would  be  combined  to  create  a  rule  that  would  give  the 
Billing  Clerk  access  to  a  particular  set  of  tables  and 
columns  before  5  pm,  and  to  a  smaller  set  after  5  pm. 
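
The Billing Clerk example can be sketched as a composite rule. The sketch below is an assumption-laden illustration in Python: the table and column names, the `DAY_ACCESS` and `EVENING_ACCESS` sets, and the function names are invented for the example, and it does not reflect how the planned rule compiler would actually represent rules.

```python
from datetime import time
from typing import Dict, Iterable, Set

# Hypothetical access sets: table name -> columns the Billing Clerk may read.
DAY_ACCESS: Dict[str, Set[str]] = {
    "billing": {"patient_id", "amount", "insurer"},
    "demographics": {"patient_id", "zip"},
}
EVENING_ACCESS: Dict[str, Set[str]] = {
    "billing": {"patient_id", "amount"},
}
CUTOFF = time(17, 0)  # 5 pm


def allowed_access(now: time) -> Dict[str, Set[str]]:
    """Check Times: choose the access set that applies at this time of day."""
    return DAY_ACCESS if now < CUTOFF else EVENING_ACCESS


def check_query(now: time, table: str, columns: Iterable[str]) -> bool:
    """Check Tables and Check Columns against the time-dependent access set."""
    allowed = allowed_access(now)
    return table in allowed and set(columns) <= allowed[table]
```

For instance, `check_query(time(18, 0), "demographics", ["zip"])` fails under these assumed sets because the demographics table is not in the evening access set.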


Another  possible  improvement  would  be  to  port  the 
entire  system  to  Java.  A  Java  environment  would  al¬ 
low  for  greater  interactivity  in  both  the  customer  and 
Security  Officer  interfaces.  It  would  also  simplify  the 
underlying  structure  of  the  system,  shrinking  the  num¬ 
ber  of  layers  from  four  to  two  (Java  would  be  used 
both  for  the  back-end  routines  that  provide  database 
access  and  for  the  front-end  user  interface  screens  that 
provide  user  input). 

5  Summary  and  Conclusions 

The TIHI system consolidates the security needs of
an institution’s database system, placing the burden
on the Security Mediator. By moving the security element
of the system from the databases themselves
to the Security Mediator, we have accomplished several
goals. First, we have created a solution that can
manage an institution’s data sources while disregarding
their specific physical instantiation. By rigorously
parsing queries and filtering results, the Security Mediator
is able to overcome security holes found in the
underlying data organization and storage. Second, the
Security Mediator serves as a security policy implementation,
designed to be used by institutional management
rather than by database or network administrators.
This high-level approach places the control of
computer-based data resources in the hands of those
responsible for an institution’s information, not those
responsible for its computers.

The Security Mediator concept is not limited to the
health care domain. It is applicable wherever there
is collaboration between different user domains (either
within an institution, or between institutions)
and access rights have little or no correlation
with the underlying structure of the data. Military and
manufacturing domains are potential future test-beds
for security mediator technology.

Figure 4: Custom DB Access Screen.

Figure 5: Review Queue.

Abstract

Conflicts  in  database  systems  with  both  real-time  and  security  requirements  can  sometimes  be 
unresolvable.  We  attack  this  problem  by  allowing  a  database  to  have  partial  security  in  order  to 
improve  on  real-time  performance  when  necessary.  By  our  definition,  systems  that  are  partially 
secure  allow  security  violations  between  only  certain  levels.  We  present  the  ideas  behind  a  speci¬ 
fication  language  that  allows  database  designers  to  specify  important  properties  of  their  database 
at  an  appropriate  level.  In  order  to  help  the  designers,  we  developed  a  tool  that  scans  a  database 
specification  and  finds  all  unresolvable  conflicts.  Once  the  conflicts  are  located,  the  tool  takes  the 
database  designer  through  an  interactive  process  to  generate  rules  for  the  database  to  follow  dur¬ 
ing  execution  when  these  conflicts  arise.  We  briefly  describe  the  BeeHive  distributed  database 
system,  and  discuss  how  our  approach  can  fit  into  the  BeeHive  architecture. 

1. Introduction

A real-time system is one whose basic specification and design correctness arguments
must include its ability to meet its timing constraints. This implies that its correctness depends not
only on the logical correctness, but also on the timeliness of its actions. To function correctly, it
must produce a correct result within a specified time, called a deadline. In these systems, an action
performed too late (or even too early) may be useless or even harmful, even if it is functionally
correct [16]. If the timing requirements of certain safety-critical applications
are violated, the results could be catastrophic.

Traditionally,  real-time  systems  manage  their  data  (e.g.  chamber  temperature,  aircraft 
locations)  in  application  dependent  structures.  As  real-time  systems  evolve,  their  applications 
become  more  complex  and  require  access  to  more  data.  It  thus  becomes  necessary  to  manage  the 
data  in  a  systematic  and  organized  fashion.  Database  management  systems  provide  tools  for  such 
organization. The resulting integrated system, which provides database operations with real-time
constraints, is generally called a real-time database system.

In  many  of  these  applications,  security  is  another  important  requirement,  since  the  system 
maintains  sensitive  information  to  be  shared  by  multiple  users  with  different  levels  of  security 
clearance. As more and more of such systems are in use, one cannot avoid the need for integrating
them. Not much work has been reported on developing database systems that support both
multilevel security and real-time requirements. In this paper, we address the problem of supporting
both real-time and security requirements, based on the notion of partial security.

1.1  Real-time  Database  Systems 


Real-time database systems extend the set of correctness requirements from conventional
database systems. Transactions in real-time systems must meet their timing constraints, often
expressed as deadlines, in order to be correct. In stock market applications and automated factories,
a poor response time from the database can result in the loss of money and property. In many
real-time database systems, transactions are given priorities, and these priorities are used when
scheduling transactions. In most cases, the priority assigned to a transaction is directly related to
the deadline of the transaction. For example, in the Earliest Deadline First scheduling algorithm,
transactions are assigned priorities according to the proximity of their deadlines: the transaction
with the closest deadline gets the highest priority, the transaction with the next closest deadline
gets the next highest priority, and so on. One important goal of a real-time transaction scheduler is
to minimize or eliminate priority inversions, situations where a high priority
transaction is forced to wait for a lower priority transaction to complete. As we shall see below, it
is this goal that comes in conflict with security requirements.
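
As a small illustration of Earliest Deadline First ordering, the following Python sketch sorts transactions so that the one with the closest deadline comes first. The `Transaction` class and `edf_order` function are names chosen for this example; a real scheduler would also handle arrivals, preemption, and lock conflicts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transaction:
    name: str
    deadline: float  # absolute deadline, e.g. seconds since system start


def edf_order(transactions: List[Transaction]) -> List[Transaction]:
    """Return transactions in priority order: closest deadline first."""
    return sorted(transactions, key=lambda t: t.deadline)
```

A priority inversion in this setting would be any moment at which a transaction earlier in this order waits for one later in it.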


1.2  Multilevel  Secure  Database  Systems 

Multilevel secure database systems have a set of requirements that are beyond those of
conventional database systems. A number of conceptual models exist that specify access rules for
transactions in secure database systems. One important such model is the Bell-LaPadula model.
In this model, a security level is assigned to transactions and data. A security level for a transaction
represents its clearance level; for data, the security level represents the classification level.
Transactions are forbidden from reading data at a higher security level, and from writing data to a
lower security level. If these rules are kept, a transaction cannot gain direct access to any data at a
higher security level.

However, system designers must be careful of covert channels. A covert channel is an
indirect means by which a high security clearance process can transfer information to a low security
clearance process [7]. If a transaction at a high security level collaborates with a transaction at
a lower security level, information could flow indirectly. For example, say that transaction Ta
wishes to send one bit of information to transaction Tb. Ta has top secret clearance, while Tb has a
lower clearance. If Ta wishes to send a “1”, it locks some data item previously agreed upon. (This
data item could be one that is created specifically for this covert channel by Ta.) If Ta wishes to
send a “0”, it does not lock the data item. Then, when Tb tries to read the data item and finds it
locked, it knows that Ta has sent a “1”; otherwise, it knows that Ta has sent a “0”. Covert channels
may use the database system’s physical resources instead of specific data items.
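
The lock-based channel in the example above can be simulated in a few lines. The Python sketch below is a toy model under simplifying assumptions: `LockTable` is a hypothetical in-memory stand-in for a lock manager, and the sender and receiver take strict turns, which a real system would not guarantee.

```python
from typing import Iterable, List, Set


class LockTable:
    """Toy lock manager: tracks which data items are currently locked."""

    def __init__(self) -> None:
        self._locked: Set[str] = set()

    def lock(self, item: str) -> None:
        self._locked.add(item)

    def unlock(self, item: str) -> None:
        self._locked.discard(item)

    def is_locked(self, item: str) -> bool:
        return item in self._locked


def send_bit(locks: LockTable, item: str, bit: int) -> None:
    """High-level sender: lock the agreed item for a 1, leave it free for a 0."""
    if bit:
        locks.lock(item)
    else:
        locks.unlock(item)


def receive_bit(locks: LockTable, item: str) -> int:
    """Low-level receiver: infer the bit from whether its read would block."""
    return 1 if locks.is_locked(item) else 0


def transmit(bits: Iterable[int], item: str = "agreed_item") -> List[int]:
    locks = LockTable()
    received = []
    for bit in bits:
        send_bit(locks, item, bit)
        received.append(receive_bit(locks, item))
    return received
```

No data is ever read across levels here; the information flows only through the observable lock state, which is what makes the channel covert.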

One sure way to eliminate covert channels is to design a system that meets the requirements
of non-interference. In such a system, a transaction cannot be affected in any manner by a
transaction at a higher security level. In other words, a subject at a lower access class should not
be able to distinguish between the outputs from the system in response to an input sequence
including actions from a higher level subject and an input sequence in which all inputs at a higher
access class have been removed [7]. For example, a transaction must not be blocked or preempted
by a transaction at a higher security level.

1.3  Integration  of  Real-time  and  Security  Requirements 

The requirements of real-time systems and those of security systems are often in conflict.
Frequently, priority inversion is necessary to avoid covert channels. Consider a transaction
with a high security level and a high priority entering the database. It finds that a transaction with
a lower security level and a lower priority holds a write lock on a data item that it needs to access.
If the system preempts the lower priority transaction to allow the higher priority transaction to
execute, the principle of non-interference is violated, for the presence of a high security transaction
affects the execution of a lower security transaction. On the other hand, if the system delays
the high priority transaction, a priority inversion occurs. The system has encountered an unresolvable
conflict. In general, these unresolvable conflicts occur when two transactions contend for the
same resource, with one transaction having both a higher security level and a higher priority level
than the other. Therefore, creating a database that is completely secure and strictly avoids priority
inversion is not feasible. A system that wishes to accomplish the fusion of multi-level security and
real-time requirements must make some concessions at times. In some situations, priority inversions
might be allowed to protect the security of the system. In other situations, the system might
allow covert channels so that transactions can meet their deadlines.
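
The condition for an unresolvable conflict stated above can be written as a simple predicate. In this Python sketch, `Txn` and `is_unresolvable` are illustrative names, and security levels and priorities are assumed to be integers where a larger value means a higher level or priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    security: int  # larger value = higher security level (assumed encoding)
    priority: int  # larger value = higher priority (assumed encoding)


def is_unresolvable(a: Txn, b: Txn) -> bool:
    """True if two transactions contending for the same resource cannot be
    ordered without either a priority inversion or a non-interference violation,
    i.e. one of them has both the higher security level and the higher priority."""
    high, low = (a, b) if a.security > b.security else (b, a)
    return high.security > low.security and high.priority > low.priority
```

When the higher-security transaction has the lower priority, the scheduler can simply favor the other transaction, and neither requirement is violated.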

There are other factors, besides security enforcement, that could degrade the timeliness of
the database system. For example, transient overload or failure of certain components could
impact the system performance. However, regardless of the reason for impaired timeliness, relaxing
security requirements always provides a positive impact on the system performance.


1.4  Our  Approach 

Our approach to this problem of conflicting requirements involves dynamically keeping
track of both the real-time and the security aspects of the system performance. When the system is
performing well and making a high percentage of its deadlines, conflicts that arise between security
and real-time requirements will tend to be resolved in favor of the security requirements more
often, and more priority inversions will occur. However, the opposite is true when the real-time
performance of the system starts to degrade. Then, the scheduler will attempt to eliminate priority
inversions, even if it means allowing an occasional covert channel.
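
One simple way to picture this dynamic behavior is a monitor that tracks the deadline miss ratio and switches its conflict resolution once the ratio passes a threshold. The Python sketch below is only an assumption about how such a policy might look: the `PerformanceMonitor` class, the 10% threshold, and the two action names are invented for illustration and are not taken from the system described here.

```python
class PerformanceMonitor:
    """Tracks met and missed deadlines and picks a resolution for each conflict."""

    def __init__(self, threshold: float = 0.10) -> None:
        self.met = 0
        self.missed = 0
        self.threshold = threshold  # assumed tolerable miss ratio

    def record(self, met_deadline: bool) -> None:
        if met_deadline:
            self.met += 1
        else:
            self.missed += 1

    def miss_ratio(self) -> float:
        total = self.met + self.missed
        return self.missed / total if total else 0.0

    def resolve_conflict(self) -> str:
        # Performing well: keep security and accept a priority inversion.
        if self.miss_ratio() <= self.threshold:
            return "delay_high_priority"
        # Degraded: avoid the inversion and accept a potential covert channel.
        return "preempt_low_priority"
```

A hard threshold is the crudest possible policy; the approach described above would resolve conflicts in favor of security "more often" rather than always, which suggests a graded rule.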

Semantic information about the system is necessary when making these decisions. This
information could be specified before the database became operational using a specification language.
In this language, users would be able to express the relative importance of keeping information
secure and meeting deadlines. Specifications in this language could then be “compiled” by
a pre-processing tool. After a successful compilation, the system should be deterministic in the
sense that an action must be clear for every possible conflict that could arise. This action might
depend on the current level of real-time performance or other aspects of the system. Any ambiguities
would be caught at compile time, causing the compilation to be unsuccessful. The compilation
of the specification produces output that can be understood and used by the database system.

The problem of accomplishing the union of security and real-time requirements becomes
more complicated in a distributed environment. In a distributed environment, having a single
entity keep track of system performance in terms of timeliness and security for the entire global
database could be impractical for a number of reasons. Requiring transactions to report to this
performance monitor after every execution could put more load on the network and have a negative
impact on performance. The node that contained the performance monitor would be a
hotspot and might introduce a performance bottleneck. These problems would be serious as the
system got bigger, so such a solution would have limited scalability. This brings up an interesting
question: Is it better to have many performance monitors, each responsible for a small part of
the database, or to have fewer of them, each with a larger domain? In other words, what granularity
of the system should the performance monitors be responsible for? In our approach, there is a
performance monitor responsible for every node. One of the issues to be addressed in a system
with multiple performance monitors is how to optimize the database globally with only local
knowledge. In our approach, this is accomplished through communication between performance
monitors at each node.

In the next section, we describe some related work in the areas of specifying real-time
and security requirements for database systems, distributed security models, and some previous
work in combining the requirements of real-time and secure database systems. Section 3 describes
the partial security policy, the ideas behind the specification language, and the tool to analyze the
language. In Section 4, we describe how our approach will fit into BeeHive, a distributed database
system with real-time, security, quality of service, and fault-tolerance requirements. Section 5
concludes the paper with a discussion of future work.


2.  Related  Work 

There has been work on specifying security requirements. One approach is using the
[...] a number of security [...] constraints to [...]
database depending on the content or security level of data. Constraints
can also depend on real-world events, information that has been previously released, and can
[...] other constraints. The current [...] specification, but eventually, we
need to support a complete security constraint classification for each application.

[...] real-time requirements also exist. For example, a real-time
[...] presented in [8]. The fundamental building
block in this design is called an atomic activity. This specification system does employ some
[...] techniques to group and relate these atomic activities through graphs. An activity, which is
defined as a set of computations, can be viewed as a transaction. Each activity is given a set
of [...] that include name, preconditions, postconditions, preemptability, state variables,
importance level, timing constraints, resource requirements, and behavior. The activities are also
[...] properties, such as arrival time, ready time, scheduling deadline, completion deadline,
execution time, starting time, and completion time. Our model for the specification of real-time
properties was influenced by these methods, and is probably closest to the model for the
[...]. However, some of the properties used in that model were not necessary in ours,
and we needed to add a couple of properties not present in the atomic activity model.

There have been attempts to define security protocols in distributed, object-oriented environments.
Two examples are Legion [15] and CORBA [3]. However, we are not aware of any previous
attempts to satisfy both security and real-time requirements in a distributed, object-oriented
environment. George and Haritsa studied the problem of combining real-time and security
requirements [5]. They examined real-time concurrency control protocols to identify the ones that
can support the security requirement of non-interference. This work is fundamentally different
from our work because they make the assumption that security must always be maintained. In
their work, it is not permissible to allow a security violation in order to improve on real-time performance.


3.  Specification 

In  this  section,  we  first  outline  the  approach  to  defining  partial  security.  We  then  provide 
the  details  of  specifying  different  rules  for  the  database  system. 

3.1  Partial  Security 

As explained above, our approach will at times call for a violation of security in order to
uphold a timeliness requirement. When this happens, the system will no longer be completely
secure; rather, it will only be partially secure. One of the major research questions to be addressed
is to identify quantitative partial security levels and to develop methods for making trade-offs for
real-time requirements. Traditionally, the notion of security has been considered binary. A system
is either secure or not. A security hole either exists or not. The problem with such a binary notion of
security is that in many cases, it is critical to develop a system that provides an acceptable level of
security and risk, based on the notion of partial security rather than unconditional absolute security,
in order to satisfy other conflicting requirements. In that regard, it is important to define the exact
meaning of partial security, for security violations of confidential data must be strictly controlled.
A security violation here indicates a potential covert channel, i.e., a transaction may be affected
by a transaction at a higher security level.

One  approach  is  to  define  security  in  terms  of  a  percentage  of  security  violations  allowed. 
However,  the  value  of  this  definition  is  questionable.  Even  though  a  system  may  allow  a  very  low 
percentage  of  security  violations,  this  fact  alone  reveals  nothing  about  the  security  of  individual 
data.  For  example,  a  system  might  have  a  99%  security  level,  but  the  1%  of  insecurity  might  allow 
the  most  sensitive  piece  of  data  to  leak  out.  For  serious  secure  database  applications,  a  more  pre¬ 
cise  metric  would  be  necessary. 

A better approach involves adapting the Bell-LaPadula security model and blurring
boundaries between security levels in order to allow partial security. In this scheme, only violations
between certain security levels would be allowed. As the real-time performance of the system
degrades, more and more boundaries can be blurred, allowing more security violations and
reducing the number of security conflicts. Since there are fewer conflicts, this can improve the real-time
performance of the system. Additionally, with this scheme, we can still make guarantees
about the security of the data. See Figure 1 for an example. Here, we are considering a system with
four security levels: top secret, secret, confidential, and unclassified. In Figure 1a, the system is
completely secure. Figures 1b through 1d show systems that are partially secure, progressing
from more secure to completely insecure. Solid lines between security levels indicate that no violations
are allowed between the levels; dashed lines indicate that violations are allowed. For
example, in Figure 1b, transactions that are at the unclassified level may have conflicts with transactions
at the confidential level in accessing unclassified data, resulting in a potential covert
channel.

It is possible to combine this approach with the use of percentages to define partial
security. Then, the proportion of security violations between two levels for which the boundary had
been blurred would be required to fall below this percentage. The above example is really a special
case of this scheme, where levels can either be 0% or 100%. Note that no guarantees can be made
between levels that have been assigned a non-zero percentage. Guarantees can still be made
between levels designated as allowing 0% security violations; for the other levels, database
designers can use different percentages to denote their preferences on where they would rather
have the potential security violations occur.
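The percentage-based scheme can be illustrated with a minimal sketch. It assumes, purely for illustration, that levels are numbered 0 (unclassified) to 3 (top secret), that each boundary between adjacent levels carries an allowed violation percentage, and that a violation between two levels crosses every boundary lying between them. The function name violation_allowed and the list layout are hypothetical, not part of the scheme as published.

```python
from typing import Sequence

# Assumed numbering: index 0 is the lowest level, index 3 the highest.
LEVELS = ["unclassified", "confidential", "secret", "top secret"]


def violation_allowed(level_a: int, level_b: int,
                      limits: Sequence[float],
                      observed: Sequence[float]) -> bool:
    """Return True if one more violation between the two levels is tolerated.

    limits[i] is the allowed violation percentage for the boundary between
    level i and level i + 1 (0 means a solid boundary); observed[i] is the
    percentage measured so far at that boundary.
    """
    low, high = sorted((level_a, level_b))
    # Every boundary between the two levels must still be under its limit.
    return all(observed[i] < limits[i] for i in range(low, high))


# Figure 1b as a special case: only the lowest boundary is blurred.
figure_1b = [100.0, 0.0, 0.0]
observed = [3.0, 0.0, 0.0]
print(violation_allowed(0, 1, figure_1b, observed))
print(violation_allowed(0, 3, figure_1b, observed))
```

A limit of 0 can never be undercut, so a solid boundary blocks every violation that would cross it, which matches the guarantee described for 0% levels.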


Figure 1 - Partial Security Levels


For certain applications in which absolute security is required, such as safety-critical
applications, any trade-off of security for timeliness must not be allowed. The idea of partial
security discussed in this paper cannot be used in such applications. Even if partial security is
acceptable to an application, the system designer should be careful in identifying the conditions
under which it might be dangerous to compromise the security. For example, some sort of denial of
service attack could force the system into a condition where timeliness constraints are not
satisfied. The system can limit the potential damage by setting up rules that can identify the
situation and take appropriate actions, if necessary. For example, the system may audit the possible
covert channels and log any activity that might be exploring the channel. The rules can utilize the
notion of an encrypted profile either to look for patterns of illegal access or, alternatively, to
certify a good pattern of access.


3.2  Specification  Methods 

Application designers should be able to specify semantic information using a specification
language to express the relative importance of keeping the desired level of security and meeting
timing constraint requirements. A question to be addressed in that approach is the verification of
the given specification. Specifications should be compiled and verified to check for any
inconsistency in the requirements and to clearly determine the necessary actions to be taken. We
developed a specification language that allows designers to generate rules at varying levels of
detail. We have also developed a tool to analyze the specification to identify any inconsistency and
produce semantic information and rules that will be maintained by the database system. The approach
to specifying the security and real-time requirements is a pre-processor that aids the database
designer first with locating conflicts and then with denoting their preferences according to the
semantics of the database. First, we will give the details of the specification language. We will
then go through a few examples to illustrate the ideas.

3.2.1  Specification  Language 

Our specification language allows designers to create rules at varying levels of detail. In
applications where much information is known about the database beforehand, designers can
control security and real-time aspects of the database much more tightly than in situations where
less is known beforehand or such tight control is not required. There are three levels of detail in
this specification scheme. Note that one system can use rules from all three levels if needed.

The specification consists of two parts: a description of the database and a set of rules to
follow when conflicts arise. The description provides a framework for the rules. As we shall see
below, the specification of both the description and the rules varies between the different levels
of detail. Regardless of the levels of detail that are used, the first part of the specification
contains facts about the database as a whole. Here, designers specify the number of data items, the
number of security levels, and the number of priority levels used in the entire database.

In the first, most detailed level, designers can make rules for specific transactions.
Transactions are given a number of components. Each transaction is given a readset and a writeset.
These can consist of any number of data items. If no readset or writeset is given, they are
assumed to be empty. The real-time requirements of a transaction are given by four variables:
priority, execution time, release time, and periodicity. The periodicity of a transaction defines
how often it starts executing, and the release time indicates the offset of the periodic start.
Finally, transactions are given a security level.

Information about data can also be specified. Data items are specified by number, and
each data item is given a security level. The specification can also contain a default security
level, which is assigned to any unspecified data items. All of this information about transactions
and data belongs in the description portion of the specification.
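One way to picture the description portion is as a pair of record types. The sketch below is only an illustration: the class names Transaction and Description, the field names, and the use of Python dataclasses are assumptions, while the example values are those shown in Figure 2.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Transaction:
    name: str
    security: int
    priority: int
    # Readset and writeset default to empty, as stated in the text.
    readset: FrozenSet[int] = frozenset()
    writeset: FrozenSet[int] = frozenset()
    # Real-time components; None marks a component left unspecified.
    execution_time: Optional[int] = None
    release_time: Optional[int] = None
    periodicity: Optional[int] = None


@dataclass
class Description:
    num_data_items: int
    num_security_levels: int
    num_priority_levels: int
    default_security: int = 1
    data_security: Dict[int, int] = field(default_factory=dict)
    transactions: Dict[str, Transaction] = field(default_factory=dict)

    def security_of(self, item: int) -> int:
        """Security level of a data item, falling back to the default."""
        return self.data_security.get(item, self.default_security)


# The description from Figure 2, transcribed into these hypothetical types.
desc = Description(5, 4, 4, default_security=1, data_security={3: 2})
desc.transactions["ComputeProfit"] = Transaction(
    "ComputeProfit", security=3, priority=3,
    readset=frozenset({1, 2, 3, 4}), writeset=frozenset({5}), periodicity=12)
desc.transactions["UpdatePrice"] = Transaction(
    "UpdatePrice", security=2, priority=2,
    writeset=frozenset({3}), periodicity=30)
print(desc.security_of(3), desc.security_of(4))
```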

Not all of these components for transactions and data items are required. In general
purpose database systems, some of the information might be hard to specify. However, in many
real-time applications, most information is available, since such information is necessary for
schedulability analysis of the system to support the timeliness and predictability requirements. In
fact, in real-time database systems, many transactions are periodic and their access pattern is
known. The only truly necessary components are the security level and the priority level. If a
designer leaves out, for example, the readset or the writeset, the preprocessor tool (discussed
below) cannot make any assumptions about the data accessed by this transaction, so it must assume
that the transaction may conflict with every other transaction.

Next, the database designer comes up with rules that define the actions that the system
must take when the transactions conflict. These rules can either be static or dynamic. Static rules
apply to conflicts that are resolved in the same way every time. For example, the user might
specify that a conflict between two specific transactions, or two categories of transactions, will
never result in a security violation.

Dynamic rules can depend on certain run-time variables that the database keeps track of
during execution. Currently, dynamic rules can be based on three different dynamic variables:
security violation percentage, transaction miss percentage (the percentage of transactions that
have missed their deadlines), and the number of consecutive missed deadlines. Each dynamic rule
has a list of clauses and a default action. A clause contains a boolean relation (>, >=, =, <, or <=)
between one of these three dynamic variables and a constant value. Each clause also contains the
action (either violate security or violate priority) to be taken if the boolean relation is true.
When a conflict is encountered by the database system, it checks the first clause. If that clause is
true, it takes the associated action. If not, it checks the next clause. If none of the clauses turn
out to be true, the database takes the default action. For example, a rule might be “If the security
violation percentage is greater than 5, violate security. Otherwise violate timeliness.” Here, the
“otherwise” sentence represents the default action.
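The clause-by-clause evaluation just described can be sketched as follows. This is a minimal illustration rather than the tool's actual implementation: the class names Clause and DynamicRule and the dictionary of run-time statistics are assumptions, and the example clauses are taken from the local rule in Figure 2.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Tuple

# The five boolean relations permitted in a clause.
RELATIONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class Clause:
    variable: str      # name of a dynamic variable
    relation: str      # one of the keys of RELATIONS
    constant: float
    action: str        # action taken when the relation holds


@dataclass(frozen=True)
class DynamicRule:
    clauses: Tuple[Clause, ...]
    default_action: str

    def decide(self, stats: Dict[str, float]) -> str:
        """Return the action of the first true clause, else the default."""
        for clause in self.clauses:
            holds = RELATIONS[clause.relation]
            if holds(stats[clause.variable], clause.constant):
                return clause.action
        return self.default_action


# Local rule of Figure 2, transcribed into the hypothetical types above.
local_rule = DynamicRule(
    clauses=(
        Clause("SecViolation%", ">=", 5, "violateTimeliness"),
        Clause("TransMiss%", ">", 10, "violateSecurity"),
    ),
    default_action="violateTimeliness",
)
print(local_rule.decide({"SecViolation%": 2, "TransMiss%": 20}))
```

Because clauses are checked in order, the position of a clause encodes its precedence: a security crisis listed first overrides a real-time crisis listed second.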

In a distributed environment, when conflicts occur between transactions executing at
different nodes, the action taken may need to depend on the performance at all the nodes. In that
case, rules should be created that take into account the statistics on every node involved in the
transaction. Every rule should have a “partner” rule, covering this contingency. This rule might
also take into account the latency between the nodes at which the conflicting transactions are being
executed.


The second level of specification detail replaces specific transactions with categories of
transactions. Transactions are categorized by their security levels and priority levels. The
designer can create any number of categories at any granularity that he or she feels is appropriate,
and describes these categorizations in the description portion of the specification. Then, rules are
created for conflicts between categories of transactions. These rules are the same as the rules for
the first level.


In the third level of specification, designers create a set of rules describing actions to take
in case of conflicts. Here, the conflicts are not specific; the same rule set is consulted for every
conflict. Conditions would depend on the characteristics of the transactions that are conflicting or
the current performance statistics. Depending on the results of the comparison, the rule would
mandate either a security violation or a priority violation. All of this information belongs in the
rules portion of the specification; nothing is needed in the description portion.

By carefully creating the rules, database designers can implement the partial security
scheme described in the previous section. As with many other aspects of designing these rules, a
tool can help designers carefully model their partial security system.

Descriptions and rules for these detail levels can be mixed. In this case, when the database
encounters a conflict, it first checks to see if a level 1 rule applies. If not, it searches the
level 2 rules, and finally checks the level 3 rules.
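The lookup order across mixed detail levels can be sketched as a simple fallback chain. The function find_rule, the dictionary-based rule tables, and the transaction and category names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet


def find_rule(t1: str, t2: str,
              level1_rules: Dict[FrozenSet[str], str],
              level2_rules: Dict[FrozenSet[str], str],
              level3_rule: str,
              category_of: Dict[str, str]) -> str:
    """Pick the rule for a conflict: level 1 first, then 2, then 3."""
    # Level 1: a rule written for this specific pair of transactions.
    pair = frozenset((t1, t2))
    if pair in level1_rules:
        return level1_rules[pair]
    # Level 2: a rule written for the pair of categories, if both are known.
    if t1 in category_of and t2 in category_of:
        categories = frozenset((category_of[t1], category_of[t2]))
        if categories in level2_rules:
            return level2_rules[categories]
    # Level 3: the general rule set consulted for every remaining conflict.
    return level3_rule


# Hypothetical rule tables; the string values stand in for rule objects.
level1 = {frozenset(("T1", "T2")): "rule for T1-T2"}
level2 = {frozenset(("high", "low")): "rule for high-low"}
categories = {"T1": "high", "T2": "low", "T3": "low"}
print(find_rule("T1", "T3", level1, level2, "general rule set", categories))
```

Keying the tables on unordered pairs means a rule applies regardless of which of the two transactions detected the conflict.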


3.2.2  Examples 

Figure 2 shows an example of a system completely specified with detail level 1. This is a
small example, with only two transactions. Every relevant component of these transactions has
been specified. Both transactions access data item 45, and ComputeAverage writes to it, so we
have a potential conflict. Since ComputeAverage has both a lower security level and a lower
priority level than SampleTransaction, this conflict cannot be resolved without causing either a
covert channel or a priority inversion. Had ComputeAverage been given a higher priority than
SampleTransaction, we could satisfy both requirements by allowing ComputeAverage to preempt
SampleTransaction. Alternatively, if ComputeAverage had a higher security level than
SampleTransaction, then both requirements could be satisfied by forcing ComputeAverage to wait
for SampleTransaction. As will be seen in the next section, the task of locating such conflicts can
be automated.

There are two rules for this conflict: the local rule and the non-local rule. In the rule
specification, SecViolation% indicates the percentage of security violations and TransMiss%
indicates the deadline miss ratio as a percentage.

Description:

numDataItems 5;
numSecurityLevels 4;
numPriorityLevels 4;

data[default].security = 1;
data[3].security = 2;

ComputeProfit.readset = 1, 2, 3, 4;
ComputeProfit.writeset = 5;
ComputeProfit.periodicity = 12;
ComputeProfit.priority = 3;
ComputeProfit.security = 3;

UpdatePrice.writeset = 3;  # Two transactions access data item 3.
UpdatePrice.periodicity = 30;
UpdatePrice.security = 2;
UpdatePrice.priority = 2;

Rule for ComputeProfit-UpdatePrice conflict:

(SecViolation% >= 5) ~ violateTimeliness,
(TransMiss% > 10) ~ violateSecurity,
(otherwise) ~ violateTimeliness;

((LocTransMiss% <= 15) & (RemTransMiss% <= 10)) ~ violateTimeliness,
((LocSecViolation% < 10) & (RemSecViolation% < 10)) ~ violateSecurity,
(otherwise) ~ violateTimeliness;

Figure 2 - Example of specification with fully specified detail level 1

Each rule consists of a condition and a decision. The condition part of a rule is stated inside
parentheses and is followed by the decision after a tilde (~). Conditions can be connected by
logical AND (&) or OR (|). In the local rule, the first line represents a security crisis. If more
than 5 percent of transactions have violated security, then this transaction cannot afford to, so
it must violate timeliness. If the condition in the first line is false, the condition in the next
line is checked. This line represents a real-time crisis. If more than 10 percent of transactions
have missed their deadlines, then the real-time performance is suffering, so this transaction must
violate security. Again, if the condition in this second line is false, the next line is checked.
Here, this line is the “catch-all” rule. If none of the above rules apply, the database is
instructed to violate timeliness.

The non-local rule operates in much the same way. The first line in this rule represents a
state in which the real-time performance of the system is at an acceptable level either locally or
at the remote site. If either condition is satisfied, the database is instructed to violate
timeliness. If the real-time performance is not at an acceptable level, the system checks the second
line of the rule to determine if the security of the system is acceptable. If so, it violates
security; if not, it moves on to the third, “catch-all” line and violates timeliness.

[...] conflict between the specific transactions is specified in the same manner [...]
specification for conflicts between two transaction categories. These are also specified in the
same manner.

This specific level 2 rule is also an example of a static rule: every time that transactions
[...] violations [...] uphold security. Rules for both transactions and transaction categories can
be specified, if the database designer so desires. Finally, we see a rule set for detail level 3.
If none of the rules in level 1 or level 2 apply to a conflict encountered by the database, it
determines the course of action by consulting this ruleset. Again, these are specified in the same
manner, with the exception that a couple of new variables can be used. The variable
priorityLevelDifference represents the difference between the priority levels [...];
securityLevelDifference does the same for security levels.

In Figure 4 we give an example of a rule that deals with multiple conflicts. This rule is
interpreted much like the level 3 rules. In the first line, the database has allowed a high number
of security violations in the past, so the rule commands the database to execute the transaction
with the lowest security in order to avoid all covert channels. The second line deals with a
database that has allowed too many transactions to miss their deadlines; here, the database will
execute the transaction with the highest priority. If the database does not have a real-time or
security crisis, then the transaction that has been waiting the longest will execute.

3.3  Tool  Implementation 

When the pre-processor executes, the description portion of the specification is read and
stored in internal data structures. The processor checks for syntax errors and, if no errors are
found, it analyzes the specification and finds all potential conflicts between the security and
real-time requirements. For completely specified level one descriptions, in order for two
transactions to conflict, the following must be true:

1. They must both access the same data item.

2. At least one of the transactions must write to the data item.

3. One transaction must be at a higher security and priority level than the other.

4. The execution times of the transactions must intersect.

Every pair of transactions that satisfies these conditions is reported to the user. Of course,
in less detailed descriptions, not all of these conditions apply. For example, if the readset or
writeset of one of the transactions is left unspecified, then the first two conditions do not apply.
If the timing information is incomplete for one of the transactions, the last condition does not
apply. For level 2 categories, all categories might conflict, so every possible pair of categories
is reported to the designer.
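The four conditions, together with the relaxations for incomplete descriptions, can be sketched as below. Several points are assumptions made for illustration only: the Txn record and its field names, the use of None to mark an unspecified component, and the reading of "execution times intersect" as an overlap between the periodic windows [release + k * period, release + k * period + exec_time), which presumes that exec_time does not exceed period.

```python
from dataclasses import dataclass
from itertools import combinations
from math import gcd
from typing import FrozenSet, Iterator, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Txn:
    name: str
    security: int
    priority: int
    readset: Optional[FrozenSet[int]] = None    # None = left unspecified
    writeset: Optional[FrozenSet[int]] = None
    release: Optional[int] = None
    exec_time: Optional[int] = None
    period: Optional[int] = None


def data_conflict(a: Txn, b: Txn) -> bool:
    """Conditions 1 and 2: a shared data item that at least one side writes."""
    if any(s is None for s in (a.readset, a.writeset, b.readset, b.writeset)):
        return True  # unspecified sets: the conditions do not apply
    return bool(a.writeset & (b.readset | b.writeset)) or bool(b.writeset & a.readset)


def levels_clash(a: Txn, b: Txn) -> bool:
    """Condition 3: one side is higher in both security and priority."""
    return ((a.security > b.security and a.priority > b.priority)
            or (b.security > a.security and b.priority > a.priority))


def windows(t: Txn, horizon: int) -> Iterator[Tuple[int, int]]:
    start = t.release
    while start < horizon:
        yield start, start + t.exec_time
        start += t.period


def times_intersect(a: Txn, b: Txn) -> bool:
    """Condition 4, checked over one hyperperiod after both have started."""
    timing = (a.release, a.exec_time, a.period, b.release, b.exec_time, b.period)
    if any(v is None for v in timing):
        return True  # incomplete timing: the condition does not apply
    hyperperiod = a.period * b.period // gcd(a.period, b.period)
    horizon = max(a.release, b.release) + hyperperiod
    return any(s1 < e2 and s2 < e1
               for s1, e1 in windows(a, horizon)
               for s2, e2 in windows(b, horizon))


def find_conflicts(txns: Sequence[Txn]) -> List[Tuple[str, str]]:
    return [(a.name, b.name) for a, b in combinations(txns, 2)
            if data_conflict(a, b) and levels_clash(a, b) and times_intersect(a, b)]


# The two transactions of Figure 2; execution times are not given there,
# so the timing condition is skipped for this pair.
cp = Txn("ComputeProfit", 3, 3, frozenset({1, 2, 3, 4}), frozenset({5}), period=12)
up = Txn("UpdatePrice", 2, 2, frozenset(), frozenset({3}), period=30)
print(find_conflicts([cp, up]))
```

Treating a missing component as "may conflict" errs on the side of reporting too many pairs, which is the conservative choice for a tool whose output is reviewed by the designer.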

The user then goes through an interactive process to create rules that capture the
requirement for the database's actions when these conflicts are encountered. For each conflict, the
pre-processor advises the user about the implications of violating security with regard to the
scheme of partial security described above. For example, in the case of a four level secure
database, if a conflict occurs between transactions at the top secret level and the unclassified
level, allowing a security violation would force the database into the situation of Figure 1d.

Armed with this information, the user now creates the rules for the database to follow
during execution. Rules are created as explained above. The rules for detail level 3 are also
entered now. Note that since level 3 rules do not require any entries into the description portion
of the specification, a database that contains rules only of level 3 will not use the description
analyzer stage of the tool. Once the user has finished providing the rules, the pre-processor
verifies that it can determine an action to take in any possible situation. If this is not the case,
the tool finds and reports the weakness in the specification. When the specification has no
remaining weaknesses, the pre-processor creates an output file that contains the choices of the
user. This file will be referenced by the database during system execution.


4.  Functioning  in  a  Distributed  Environment 

In order to examine the distributed properties of this system further, we put it into the
context of the BeeHive system, which is a distributed database system being designed with
requirements beyond those of real-time and security. First, we give an overview of the BeeHive
system, and then we present how our approach fits into the BeeHive architecture. Note that we use
the BeeHive setting to implement our approach. The actual security subsystem of BeeHive can be
different from the approach we present in this paper.


(SecViolation% >= 10) ~ executeLowestSecurity,
(TransMiss% > 10) ~ executeHighestPriority,
(otherwise) ~ executeOldestTransaction;

Figure 4 - Example of rule for conflicts of three or more transactions.

Figure 5 - Design of a native BeeHive site


4.1  BeeHive  Overview 

The BeeHive project at the University of Virginia [10] is an attempt to build a global virtual
database with real-time, security, fault-tolerance, and quality of service. The BeeHive system
[...] sites, legacy sites ported to BeeHive, and interfaces to legacy systems outside of BeeHive.
For the purposes of this paper, we will focus on the native BeeHive sites.

Figure 5 shows the basic design of a native BeeHive site. At the application level, users
can submit transactions, analysis programs, general programs, and access audio and video data.
For each of these activities the user has a high level specification interface for real-time, QoS,
fault tolerance, and security. As transactions (or other programs) access objects, those objects
become active and a mapping occurs between the high level requirements specification and the
object API via the mapping module. This mapping module is primarily concerned with the
interface to object wrappers and with end-to-end issues. A novel aspect of the work is that each
object has semantic information (also called reflective information because it is information about
the object itself) associated with it that makes it possible to simultaneously satisfy the
requirements of time, QoS, fault tolerance, and security in an adaptive manner. For example, the
information might include rules or policies and the action to take when the underlying system
cannot guarantee the deadline or level of fault tolerance requested. This semantic information also
includes code that makes calls to the resource management subsystem to satisfy or negotiate the
resource requirements. The resource management subsystem further translates the requirements into
resource specific APIs such as the APIs for the OS, the network, the fault tolerance support
mechanisms, and the security subsystem.

The resource manager of BeeHive, referred to as the “BeeKeeper”, is the central entity of
the resource management process. The main function of the BeeKeeper is the mapping of
service-specific, possibly qualitative, QoS requirements into actual, quantitative, resource
requests. The BeeKeeper contains an Admission Controller, a Resource Planner, and a Resource
Allocation Module. The Admission Controller decides whether BeeHive has sufficient resources to
support the requirements of a new transaction without compromising the guarantees made to currently
active transactions. The Resource Allocation Module is responsible for managing the interface of
BeeHive to the underlying resource management systems of BeeHive components. The Resource
Planner attempts to globally optimize the use of resources. The Admission Controller of the
BeeKeeper merely decides whether a new application is admitted or rejected. Obviously, such a
binary admission control decision leads to a greedy and globally suboptimal resource allocation.
The Resource Planner is a module to enhance the admission control process and to yield globally
optimal resource allocations.

4.2  Supporting  Real-time  and  Partial  Security  in  BeeHive 

Along with the security and real-time requests of the transactions, the mapper conveys the
identity of the transaction and its timestamp to each of the objects that it invokes. The timestamp
is necessary to identify this specific instance of the transaction. This information is stored with
the other semantic information of the object, and is conveyed to the Resource Manager through the
APIs.

The  real-time  and  security  APIs  allow  the  objects  being  used  by  transactions  to  convey 
their  requirements  to  the  resource  manager.  In  all  cases  besides  rules  dealing  with  detail  level  1, 
this  is  all  the  information  about  the  transaction  needed  by  the  resource  manager  to  make  decisions 
when  conflicts  arise.  However,  in  detail  level  1,  the  resource  manager  needs  to  be  aware  of  the 
identity  of  the  transaction  for  which  the  object  is  executing.  This  information  can  be  conveyed 
through  either  the  security  or  the  real-time  API. 

The admission controller will be a natural choice for the agent that detects conflicts that
require the violation of either real-time or security requirements, i.e., a conflict between a
high-priority, high-security transaction and a low-priority, low-security transaction. All other
conflicts are easy to resolve, and can be handled by the Admission Controller. However, for these
special conflicts, the decision is delegated to the Resource Planner; this is the entity that
contains and executes the rules created by the database designer.

When a transaction encounters a conflict, the Admission Controller decides (perhaps after
consulting the Resource Planner) which of the two transactions is allowed to continue execution
and which must be delayed. The delayed transaction is placed in a queue associated with the
executing transaction. Once that transaction is finished, the delayed transaction may begin
execution. If two transactions are waiting on the queue, the Admission Controller consults the rule
covering this conflict and allows one of the transactions to proceed. If more than two transactions
are on the queue, and the rules do not support the execution of one of the transactions over all
the other transactions, then the Admission Controller must consult the rule that deals with this
situation.
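The rule consulted for a queue of more than two waiting transactions (Figure 4) can be sketched as a selection function. The Waiting record, the pick_next function, and the convention that larger numbers mean higher security or priority are assumptions for illustration; the thresholds are the ones shown in Figure 4.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Waiting:
    name: str
    security: int   # assumed: larger value = higher security level
    priority: int   # assumed: larger value = higher priority
    arrival: int    # time at which the transaction joined the queue


def pick_next(queue: Sequence[Waiting],
              sec_violation_pct: float,
              trans_miss_pct: float) -> Waiting:
    """Choose which delayed transaction runs next, following Figure 4."""
    if sec_violation_pct >= 10:
        # Security crisis: run the transaction with the lowest security level.
        return min(queue, key=lambda t: t.security)
    if trans_miss_pct > 10:
        # Real-time crisis: run the transaction with the highest priority.
        return max(queue, key=lambda t: t.priority)
    # No crisis: run the transaction that has waited the longest.
    return min(queue, key=lambda t: t.arrival)


# Hypothetical queue contents.
queue = [
    Waiting("A", security=3, priority=1, arrival=7),
    Waiting("B", security=1, priority=2, arrival=9),
    Waiting("C", security=2, priority=4, arrival=4),
]
print(pick_next(queue, sec_violation_pct=2.0, trans_miss_pct=15.0).name)
```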

The performance monitor fits best into the Resource Allocation Module. This module is
closest to the resources that the statistics are representing. The feedback on resource usage that
the Resource Allocation Module provides to the Resource Planner is useful for other BeeHive
functions, such as for QoS and fault tolerance requirements. For our purposes, the Resource
Allocation Module must keep track of the percentage of transactions that have committed a security
violation or missed a deadline.
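A minimal sketch of such a monitor is given below. The class PerformanceMonitor and its method names are hypothetical; the three statistics it reports correspond to the dynamic variables used by the rules (security violation percentage, transaction miss percentage, and consecutive missed deadlines).

```python
from typing import Dict


class PerformanceMonitor:
    """Running statistics over finished transactions at one node."""

    def __init__(self) -> None:
        self.finished = 0
        self.security_violations = 0
        self.missed_deadlines = 0
        self.consecutive_misses = 0

    def record(self, violated_security: bool, missed_deadline: bool) -> None:
        """Account for one finished transaction."""
        self.finished += 1
        if violated_security:
            self.security_violations += 1
        if missed_deadline:
            self.missed_deadlines += 1
            self.consecutive_misses += 1
        else:
            # A met deadline ends the current run of misses.
            self.consecutive_misses = 0

    def _pct(self, count: int) -> float:
        return 100.0 * count / self.finished if self.finished else 0.0

    def stats(self) -> Dict[str, float]:
        return {
            "SecViolation%": self._pct(self.security_violations),
            "TransMiss%": self._pct(self.missed_deadlines),
            "ConsecutiveMisses": float(self.consecutive_misses),
        }


monitor = PerformanceMonitor()
for violated, missed in [(False, True), (True, False), (False, True), (False, True)]:
    monitor.record(violated, missed)
print(monitor.stats())
```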

As we have seen, when conflicts occur between nodes, the action taken can depend on the
performance at both nodes. Therefore, some sort of cooperation and exchange of statistics must
occur between the resource managers of the nodes. In the BeeHive model, the resource managers
at different nodes should communicate with each other; this will be necessary not only for our
purposes, but also for the resource reservation necessary for QoS guarantees.

At first glance, this scheme seems to locally optimize the database, rather than globally.
[...] examined more closely, this node-by-node optimization may be preferable to a global
optimization. Consider a database with ten nodes. In eight of these nodes, the security
requirements have been upheld but the real-time performance has started to degrade. [...] two
nodes. Now, a conflict occurs between these last two [...] globally optimized, the resource
managers might decide to violate security [...] real-time performance of the system. This decision
will have little effect [...] performance of the eight nodes whose real-time performance is
degrading, and further the security problems on the two nodes where security violations are a
problem.

5.  Conclusions 

In this paper, we have presented mechanisms to allow the union of security and real-time
[...] partial security. The definition allows security violations in order to improve real-time
performance [...] compromise the security of the entire database system. However, [...] for
violations between transactions whose security levels differ [...] level [...] transactions, say,
at the highest and lowest security levels, no partial security remains in the system at all. In a
system with many such conflicts, it may be very difficult to improve on real-time performance.
However, it is essential that the system designer can specify how to manage the system security and
real-time requirements in a controlled manner in real-world applications.

Our specification language allows designers to create rules at whatever level of detail they
feel is appropriate. These rules can then be analyzed by a tool, which allows designers to create
a database and easily make conscious decisions about the partial security of the database. The tool
also automates the process of scanning through the complex dependencies of a database
specification to find conflicts. It then informs the user of the consequences of violating security
for each conflict.

Currently, we have a tool that can analyze transactions completely specified in detail level
1. This tool parses a database description, analyzes the dependencies and conflicts, and then goes
through an interactive process with the user to create rules for all possible conflicts. Our future
work includes extending this tool to handle rules and descriptions of levels 2 and 3. We are also
developing a simulator to investigate the performance of a database that uses the output of this
tool, analyzing the effects of different choices made by the user of the tool.

Abstract

The satisfaction of confidentiality demands in multi-level logic-based databases
requires a distortion of the database’s intended model. This work focuses on distortions
incurred by changes in a database’s name-space, i.e., its signature. We define
extensions to a universal name-space which preserve a database’s static and dynamic
semantics. They are useful for achieving a high degree of confidentiality. The extensions
can be embedded into the partial order of the security levels, which yields a set
of hierarchical name-spaces. This hierarchy can be used to reflect both the sharing
of protection units across security levels and the demands to keep them confidential.

1  Introduction 

A database is generally regarded as an image of a given section of the real-world. The various
elements of the database are defined in such a way that they can be identified with many important
elements of the real-world section. In particular, a logic-based database can represent entities
of the real-world section, properties of entities, snapshots of the state of the properties and
invariants of the snapshots. It can also track changes in the real-world section by changing the
snapshots. All these aspects are covered by the theory of open logic-based databases.

A  secure  database  must  meet  additional  availability,  integrity  and  confidentiality  demands.  The 
word  additional  indicates  that  a  secure  database  is  an  extension  to  an  open  database  and  must  not 
be  defined  in  contravention  of  open  databases’  principles.  Analogous  with  the  conception  of  open 
databases,  a  secure  database  should  represent  an  image  of  a  secure  real-world  section. 

Confidentiality demands arise in many situations of daily life. Some are dictated by the law
and others are stated by a company or a person to protect its or his interests. Referring to a
particular real-world section, an explicit confidentiality demand names one of its elements as the
object of confidentiality and a person from whom this object should be kept confidential.

This work deals with the meaning and the enforcement of confidentiality demands whose objects
of confidentiality are entities of the real-world section. It is a continuation of the work of
Spalka/Cremers (1996).

In the real world adults seem to need little advice on how to keep an entity secret from another
person. Limited only by their imagination, there are undoubtedly countless ways. One way, which
a database can handle elegantly, relies on the naming of the entities. Let us consider a short
example.

Suppose  person  A  has  written  a  report  which  should  be  kept  secret  from  person  B.  A  names  this 


1. In books on databases or on logic, eg Cremers/Griefahn/Hinze (1994) or Barwise (1991), usually
the word object is used here. However, in the field of security, an object denotes a protection
unit. To avoid confusion, we have decided to use the word entity instead - though we are aware
it is overloaded in other fields.

report R. First of all, A hides the report from B in a safe place. As an additional protection
measure A can choose a name R, which is not an element of B's language. If B has no command of, eg,
Russian or Chinese but A does, then A can name the report acajioda or fi'J. Then B can neither
name this report in a question nor assign the same name to one of his reports. This situation has
at least two advantages. With respect to the first aspect, A will never have to lie to B about the
secret report. With respect to the second, B can never - depending on the naming rules - cause
a naming confusion or a naming conflict.

The purpose of this example is to manifest that differing name-spaces are a valuable means for
the enforcement of confidentiality demands referring to entities. On the other hand, it is clear
that this example is not intended to be directly implemented in a secure database. Well, there are
database products, eg, Microsoft Access, that can work with many languages in addition to English,
even with Russian and Chinese. In view of the prospective applications of secure databases,
the real problem is that, though possible, it is not that simple to teach these languages to the
database users.

Structured name-spaces represent a more appropriate modelling of this example. Before we
introduce our approach in databases, let us take a brief look at two prominent cases in which
structured name-spaces are already in use: directory-structured file systems and the handling of
classified documents. In the first case, the name of a file has the form (a, b), where a is the name
of a directory and b is the name of the file local to this directory. The wish to group related files
and the need for a large name-space without long local names were among the original aims of
adopting a directory structure. The usefulness of directories for the administration of access rights
was soon realised, eg, the assignment of a home directory to each user. In the second case, the name
of a classified document has also the form (a, b). Here, a, the classification, is a security level and
b is any name. The documents a person can read, create and modify are determined by comparing
his clearance, which is also a security level, with the document's classification. In both cases,
the components of a name (a, b) play the same roles. The name b is an element of a universal
name-space $\mathfrak{A}^*$, ie a string of characters over an alphabet $\mathfrak{A}$ that comprises
the representable characters, eg, the letters of the English alphabet. Since each directory and each
security level has its own name-space, $\mathfrak{A}^*$ is a local name-space. The component a
represents a name-space selector. Lastly, the set of all possible names (a, b) constitutes the global
universal name-space. With respect to the confidentiality of an entity with the name (a, b), if the
selector a is unavailable to a user u, then u can neither address (a, b) nor create an entity with the
name (a, b). Thus this constellation supports the efforts to keep the existence of (a, b) secret from u.
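The role of the selector can be illustrated with a minimal Python sketch. It is not part of the original work: the class `GlobalNameSpace`, its methods and the selector names `"secret"` and `"public"` are hypothetical and chosen only for illustration. A user who lacks the selector a can neither address nor create a name (a, b), and equal local names under different selectors do not conflict.

```python
class GlobalNameSpace:
    """Global names are pairs (a, b): a is a name-space selector, b a local name."""

    def __init__(self):
        self.entities = {}    # (a, b) -> entity
        self.selectors = {}   # user -> set of selectors available to that user

    def grant(self, user, selector):
        """Make the local name-space chosen by `selector` available to `user`."""
        self.selectors.setdefault(user, set()).add(selector)

    def can_address(self, user, name):
        """A user can address (a, b) only if the selector a is available to him."""
        a, _b = name
        return a in self.selectors.get(user, set())

    def create(self, user, name, entity):
        if not self.can_address(user, name):
            raise LookupError("name is not in the user's name-space")
        if name in self.entities:
            raise ValueError("name already assigned")
        self.entities[name] = entity

    def lookup(self, user, name):
        if not self.can_address(user, name):
            raise LookupError("name is not in the user's name-space")
        return self.entities[name]


ns = GlobalNameSpace()
ns.grant("A", "secret")
ns.grant("A", "public")
ns.grant("B", "public")
ns.create("A", ("secret", "R"), "A's report")
ns.create("B", ("public", "R"), "B's report")  # same local name, no conflict
```

In this sketch B cannot even phrase a question about ("secret", "R"), so A never has to lie about the report, which mirrors the two advantages named in the example.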

Let us now turn our attention to databases. First of all, we would like to stress that logic-based
databases offer an excellent framework for the study of the meaning, the properties and the
effects of new database elements. Logic is today a well-understood formal instrument and the
notion of logic model is a succinct and elegant description of data semantics. From a practical
perspective, logic-based databases comprise also relational databases - as soon as the desirable
formulae are established it is easy to get down to tables and rows. Lastly, and this is very
important, this approach encourages a clear separation of declarative and procedural concepts.

The name space of a logic-based database, ie its language, is determined by its signature. Specific
to databases, the signature comprises an infinite set of only constant function symbols and a finite
set of predicate symbols. The function symbols represent a pool out of which terms are
constructed. The sole purpose of a ground, ie variable-free, term is to serve as a name for a particular
entity of the real-world section. The set of ground terms thereby comprises the names that can
be assigned to these entities. A predicate symbol is used to express a simple statement on one or
more entities' property or relationship. A ground atomic formula is a concrete statement on the
entities named by the formula's terms.² Its truth value is determined by the database's intended
model. Speaking in a more illustrative manner, the set of predicate symbols corresponds to the
names of all (base and derived) relations, the set of ground terms to possible attribute values and
the set of ground atomic formulae to possible tuples in the relations' extension.

This  situation  represents  a  flat  name  space.  To  refer  to  an  entity  one  just  has  to  know  its  name,  ie 
the  ground  term  that  has  been  assigned  to  it;  similarly,  each  property  can  be  referred  to  by  its 
predicate  symbol.  Since  both  the  ground  terms  and  the  predicate  symbols  are  sets,  there  can 
never  be  distinct  entities  or  distinct  properties  that  go  by  the  same  name. 

Any  serious  database  product  can  deny  access  to  a  relation  to  a  user.  In  response  to  a  command 
that  refers  to  such  a  relation  the  database  rejects  the  command,  usually  with  the  explanation  that 
this  relation  is  unknown  or  undefined.  This  behaviour  is  perfectly  in  accord  with  the  theoretical 
side.  We  have  created  an  individual  signature  for  a  user.  Its  set  of  predicate  symbols  is  a  subset  of 
the  open  database’s  predicate  symbols  -  there  is  no  change  in  the  function  symbols.  But,  at 
present,  it  is  not  possible  to  restrict  the  function  symbols  for  a  user  nor  to  give  him  a  set  of  private 
function  symbols.  With  respect  to  both  deficits,  there  is  no  way  a  database  will  reject  an  update 
command  on  the  grounds  that  a  ground  term  is  unknown  or  undefined.  And  a  select  command 
will  always  return  an  answer-set,  even  if  it  is  empty. 

Spalka/Cremers (1996) introduced a secure logic-based database, in which objects of confidentiality
are ground atomic formulae. The authors defined the formal meaning of a confidentiality
demand by characterising the difference between the open database's intended model and one
that can be said to satisfy the demand. This approach retains the original open database, which
corresponds to the intended state of affairs, and yields a set of deliberately falsified individual
databases. It is essential to realise that this step has not created a group of new alien databases.
They all still refer to the same real-world section: the original open database captures its intended
image, and an individual, falsified one presents a distorted image of it.

In  this  work  we  extend  that  approach  and  impose  the  structure  of  a  global  universal  name-space 
on  the  sets  of  terms  and  predicate  symbols.  The  local  subspaces  are  those  of  open  databases.  The 
choice  and  the  handling  of  the  name-space  selectors  is  the  part  relevant  to  security.  We  show, 
firstly,  that  the  most  general  case  of  sharing  and  hiding,  ie  when  a  user  can  share  an  entity  with 
any  other  group  of  users  or  keep  it  secret  from  them,  corresponds  to  choosing  the  users’  power 
set  as  the  set  of  name-space  selectors.  And,  secondly,  the  identification  of  security  levels  with 
name-space  selectors  yields  a  canonical  interpretation  of  confidentiality  demands  stated  for  terms 
and  predicate  symbols.  These  extensions  preserve  all  semantic  properties  of  open  databases  and 
all security properties of the database of Spalka/Cremers (1996).

The  subsequent  section  introduces  the  fundamentals  of  open  logic-based  databases.  Section  3 
deals  with  previous  works  in  this  field.  Beginning  with  some  prominent  algebraic  and  logical 
approaches,  it  presents  a  concise  yet  complete  definition  of  the  database  of  Spalka/Cremers 


2. More complex statements can be expressed by forming more complex formulae.

(1996). Section 4 comprises the main results of this work. Their application is illustrated with two
examples  in  section  5.  Lastly,  the  conclusion  summarises  the  main  results  of  this  work  and 
presents  a  brief  outlook. 

2  Logic-based  databases 

An alphabet $\mathfrak{A}$ is a non-empty and finite set of symbols, such that any string of
$\mathfrak{A}$'s elements has a unique decomposition into $\mathfrak{A}$'s elements. $\mathfrak{A}^*$
denotes the set of strings of finite length of elements of $\mathfrak{A}$. Let $F = \mathfrak{A}^*$ be
the set of function symbols, $P \subset \mathfrak{A}^*$ a finite set of predicate symbols, and
$\rho : P \to \mathbb{N}_0$ the function determining the arity of the predicate symbols. Then
$\Sigma = (\mathfrak{A}, P, \rho)$ is a database signature. Let $V$ be an infinite set of variables.
Then $T(\Sigma) = F \cup V$ is the set of terms over the signature $\Sigma$. $T_G(\Sigma) = F$ is the
subset of ground terms. The set of atomic formulae over $\Sigma$ is
$A(\Sigma) = \{p(t_1, \ldots, t_n) \mid p \in P, \rho(p) = n, t_i \in T(\Sigma), i = 1, \ldots, n\}$ and
$A_G(\Sigma)$ is the subset of ground atomic formulae. With $\alpha \in A(\Sigma)$, $\alpha$ is a
positive literal and $\neg\alpha$ a negative literal. The first order language over the signature
$\Sigma$ is the smallest set $L(\Sigma)$ with the following properties: $A(\Sigma) \subset L(\Sigma)$;
if $\varphi, \psi \in L(\Sigma)$, then $(\varphi) \vee (\psi) \in L(\Sigma)$; if
$\varphi \in L(\Sigma)$, then $\neg(\varphi) \in L(\Sigma)$; if $\varphi \in L(\Sigma)$ and
$X \in V$, then $\forall X : \varphi \in L(\Sigma)$. $L_0(\Sigma)$ is the subset of closed formulae,
viz, all variables of a $\varphi \in L_0(\Sigma)$ are quantified. A normal clause is a closed formula
$\alpha \leftarrow \lambda_1 \wedge \ldots \wedge \lambda_n$ in which all variables are assumed to be
universally quantified, $\alpha \in A(\Sigma)$ is the clause's head and
$\lambda_1 \wedge \ldots \wedge \lambda_n$, a conjunction of literals, its body. A clause is
range-restricted if any variable that occurs in the clause occurs also in a positive literal in its
body. $C_R(\Sigma)$ is the set of range-restricted clauses over $\Sigma$.

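Let $D$ and $\Omega$ be sets such that there is a $\omega_F \in \Omega$, $\omega_F : F \to D$, and for
each $p \in P$, $\rho(p) = n$, there is a $\omega_p \in \Omega$,
$\omega_p : D^n \to \{\mathrm{True}, \mathrm{False}\}$. Then $M(\Sigma) = (D, \Omega)$ is an
interpretation for the signature $\Sigma$. Alternative notations for restrictions to
$M(\Sigma) = (D, \Omega)$: for ground atomic formulae
$M_G(\Sigma) : A_G(\Sigma) \to \{\mathrm{True}, \mathrm{False}\}$; for closed formulae:
$M_0(\Sigma) : L_0(\Sigma) \to \{\mathrm{True}, \mathrm{False}\}$. (Some works define a model as
$M_G^{-1}(\Sigma)(\mathrm{True}) \subseteq A_G(\Sigma)$.) $M(\Sigma)$ is a model of $\Phi$,
$M(\Sigma) \in \mathrm{Mod}(\Phi)$, $\Phi \subset L_0(\Sigma)$, if
$\forall \varphi \in \Phi : M_0(\Sigma)(\varphi) = \mathrm{True}$. $M(\Sigma) = (D, \Omega)$ is a
Herbrand-interpretation or a Herbrand-model if $D = T_G(\Sigma)$ and $\omega_F = \mathrm{id}$.
$\Psi \subset L_0(\Sigma)$ is a logical consequence of $\Phi \subset L_0(\Sigma)$,
$\Phi \models \Psi$, if $\mathrm{Mod}(\Phi) \subset \mathrm{Mod}(\Psi)$.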

We assume that the intended semantics of a set $\Phi \subset L_0(\Sigma)$ is defined by its
completion³ with respect to $\Sigma$, $\mathrm{comp}(\Phi, \Sigma)$.

Let $\Sigma$ be a database signature, $C \subset L_0(\Sigma)$, $\mathrm{Mod}(C) \neq \emptyset$, a
finite set of closed formulae over $\Sigma$ and $I \subset C_R(\Sigma)$ a finite set of safe clauses
over $\Sigma$ such that $\mathrm{comp}(I, \Sigma) \models C$. Then $D = (\Sigma, C, I)$ is a
logic-based database with completion semantics. $\Sigma$ defines the database language, $C$ is the
set of integrity constraints, $I$ is a valid present state and the intended semantics of $I$ is
defined by the unique Herbrand-model $M(\Sigma) = (D, \Omega)$ of $\mathrm{comp}(I, \Sigma)$.

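A transaction $\tau$ - in a declarative notion - is a partitioned set $\tau = \delta \cup \iota$,
$\tau \subset A_G(\Sigma)$. The application of $\tau$ to $M(\Sigma)$ yields an interpretation
$M'(\Sigma)$, such that: $M(\Sigma)(d) = \mathrm{True}$ and $M'(\Sigma)(d) = \mathrm{False}$ for each
$d \in \delta$, $M(\Sigma)(i) = \mathrm{False}$ and $M'(\Sigma)(i) = \mathrm{True}$ for each
$i \in \iota$, and $M(\Sigma)(\alpha) = M'(\Sigma)(\alpha)$ for any
$\alpha \in A_G(\Sigma) \setminus \tau$. If $M'(\Sigma) \in \mathrm{Mod}(C)$, then $\tau$ is accepted
and $M'(\Sigma)$ becomes the present model of $D$; otherwise $\tau$ is rejected. $\tau$ is a
singleton-update if $|\tau| = 1$.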
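The declarative notion of a transaction can be sketched in Python. This is an illustration under simplifying assumptions, not the authors' definition: the model is represented only by the set of ground atoms that are true, derived relations and the completion are ignored, integrity constraints are plain Python predicates over that set, and a transaction whose preconditions on $\delta$ and $\iota$ fail is simply treated as rejected. The names `apply_transaction` and `at_most_one_manager` are hypothetical.

```python
def apply_transaction(model, delete, insert, constraints):
    """Apply tau = delete | insert to a model given as a frozenset of true ground atoms.

    Returns (new_model, accepted). The model is unchanged if tau is rejected.
    """
    delete, insert = frozenset(delete), frozenset(insert)
    if delete & insert:
        raise ValueError("delta and iota must be disjoint")
    # Every atom in delta must currently be true, every atom in iota currently false.
    if not delete <= model or insert & model:
        return model, False
    candidate = (model - delete) | insert
    # tau is accepted only if the resulting model satisfies all integrity constraints.
    if all(check(candidate) for check in constraints):
        return candidate, True
    return model, False


def at_most_one_manager(model):
    """Example integrity constraint: at most one 'manager' fact may hold."""
    return sum(1 for atom in model if atom[0] == "manager") <= 1


M = frozenset({("emp", "ann"), ("emp", "bob"), ("manager", "ann")})
M1, ok1 = apply_transaction(
    M, {("manager", "ann")}, {("manager", "bob")}, [at_most_one_manager]
)
M2, ok2 = apply_transaction(M, set(), {("manager", "bob")}, [at_most_one_manager])
```

The first transaction swaps the manager and is accepted; the second is a singleton-update that would violate the constraint and is therefore rejected, leaving the model as it was.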


3. Cf, eg, Das (1992) or Cremers/Griefahn/Hinze (1994).

3 Previous works

3.1  Algebraic  approaches 

Most algebraic approaches are based on the multi-level relational data model introduced by
Denning et al (1987). This model does not mention name-spaces explicitly and a formal definition of
its signature yields only a flat name space. However, the model leaves the impression that the
assignment of access classes to the elements of a tuple also intends to structure the name space.

Various problems of this model have already been observed by its inventors. Those related to the
name space in particular have been aptly stated by Gajnak (1988). The author investigates the
adaptability of the entity-relationship modelling to multi-level security requirements and
identifies three fundamental principles of multi-level databases which must not be violated.⁴ The
important semantic determinacy principle states that '... factual dependencies should be
non-ambiguous'⁵. This property is violated by SeaView's treatment of polyinstantiation and the
author gives an example in which polyinstantiation can mean that: one database entry is an alias
for another one, a secret entry has been leaked or the two entries refer to two real world objects.
He concludes aptly that in this situation referential integrity as such must be ambiguous.
Regrettably, the author's final advice - which we strongly support - that '... the determinacy
principle should be supported directly by multi-level secure data models'⁶ has been given little
attention in the following years.

Though implicitly, Sandhu/Jajodia (1992a) also deal with naming conflicts in SeaView. The work
proposes to use polyinstantiation for cover stories only.⁷ For the implementation of a cover story
the authors use the special value, 'restricted', which was introduced in Sandhu/Jajodia (1990)
and Sandhu/Jajodia (1992b). The authors demand that primary keys must not be polyinstantiated,
ie, a primary key is unique regardless of its access class. Three ways to achieve this are
suggested: 'Make all keys visible'⁸, 'Partition the domain of the primary key'⁹ and 'limit
insertions to be done by trusted subjects only'¹⁰. The first suggestion says that a flat name space
is fine if confidentiality demands for function symbols are not admitted. The second one would
violate the name space's property of being a universal one. And the third one shifts the problem of
name spaces to that of update rights.

3.2  Logical  approaches 

Among the first works which - although implicitly - consider non-uniform name-spaces are
Thuraisingham (1991) and Thuraisingham (1992). Based on the conviction that standard logic is
inadequate, the author attempts to formalise the rules and properties of mandatory access
control models in NTML, a non-monotonic logic.

Although NTML has been shown to be not sound¹¹, Garvey et al (1992) present a similar idea


4.  Gajnak  (1988)  :189. 

5.  Gajnak  (1988)  :183. 

6.  Gajnak  (1988)  :189. 

7.  We  note  that  this  is  an  assumption  contrary  to  the  one  made  in,  eg,  Sandhu/Jajodia  (1991) . 

8.  Sandhu/Jajodia  (1992a)  :315. 

9.  Sandhu/Jajodia  (1992a)  :315. 

10.  Sandhu/Jajodia  (1992a)  :316. 

11. Garvey et al (1992) :160.

of hiding function symbols. This work introduces a multi-level database as a collection of
databases. The data of each database is a theory of the function-free subset of first-order logic.¹²
An access class can be assigned to a function symbol, a predicate symbol or a whole fact. In the
first two cases, the structure of access classes induces an analogous structure of first-order
languages.¹³ However, this does not hold for the third case if the database is polyinstantiated. The
main problem of this view is that the idea to keep a function symbol secret is not examined with
respect to its compatibility with the usual database modelling approach, which assumes a
universal name space. Promising though the authors' observations are, until today the approach has
not been further developed.

3.3  Secure  logic-based  databases  with  a  common  name-space 

Spalka/Cremers (1996) present an axiomatic interpretation of security-level-based mandatory
security policies in logic-based databases that establishes database properties as proofs from only
a few assumptions. Since confidentiality demands can be stated only for ground atomic formulae,
the authors define a secure logic-based database with a common name-space. One of its important
properties is the independence of the semantics from any particular confidentiality demands,
ie, the addition or removal of confidentiality demands does not affect the semantics of the data
nor the notion of integrity. The database of Spalka/Cremers (1996) is defined as follows.

$MCP = (U, O, (S, \leq), G)$ is an instance of a security-level-based mandatory confidentiality
policy such that: $U$ is a set of individuals; $O$ is a set of protection units; $S$ is a set of
security levels on which a partial order '$\leq$' is defined; $G : U \cup O \to S$ is a
labelling-function that assigns a security level to each individual and protection unit; and with
respect to the legal obligation that 'The dissemination of information of a particular security level
(including sensitivity level and any compartments or caveats) to individuals lacking the appropriate
clearances for that level is prohibited by law'¹⁴ the following Primitive Mandatory Requirement is
satisfied: $o \in O$ should be kept secret from $u \in U$ if $G(o) \leq G(u)$ does not hold.

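Let $D = (\Sigma, C, I)$ be an open database with the intended model $M(\Sigma)$. For each
$s \in S$ there is a database $D_s = (\Sigma_s, C_s, I_s)$ with a unique intended Herbrand-model
$M^s(\Sigma_s)$ for $\Sigma_s$. $M(\Sigma)$ is the image of the open world-section and any
$M^s(\Sigma_s)$ is a distortion thereof. The distortion of $M^s(\Sigma_s)$ with respect to
$M(\Sigma)$ is recorded in its distortion-log $(A_s, X_s, Z_s)$, which is a partition of
$A_G(\Sigma)$ such that: $\forall \alpha \in A_s : M^s_G(\Sigma_s)(\alpha) = M_G(\Sigma)(\alpha)$,
$\forall \alpha \in X_s : M^s_G(\Sigma_s)(\alpha) = \neg M_G(\Sigma)(\alpha)$ and
$\forall \alpha \in Z_s : \alpha \notin A_G(\Sigma_s)$. The common name-space is due to the
assumption that $\Sigma_s = \Sigma$ for all $s \in S$, ie the distortion-log is
$(A_s, X_s, \emptyset)$.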

A unit of protection can be any ground atomic formula $\alpha \in A_G(\Sigma)$. The intended meaning
of the primitive mandatory requirement for ground atomic formulae is as follows. Let
$\alpha \in A_G(\Sigma)$ and $s \in S$. If $G(\alpha) \leq s$ does not hold and
$\alpha \in A_G(\Sigma_s)$, then $M^s_G(\Sigma_s)(\alpha) = \neg M_G(\Sigma)(\alpha)$, ie
$\alpha \in X_s$.

A user $u \in U$ can query and update any database $D_s$ with $G(u) \geq s$.
$\tau_{u,s} = \delta \cup \iota$, $\tau_{u,s} \subset A_G(\Sigma_s)$,


12.  At  this  point,  however,  the  authors  do  not  state  how  to  find  the  intended  model  of  such  a  theory. 
They  realise  that  a  straight  application  of  the  Closed  World  Assumption  may  lead  to  a  contradic¬ 
tion.  To  omit  this  problem,  they  propose  to  view  the  theory  as  a  set  of  beliefs,  yet  without  speci¬ 
fying  the  intended  model. 

13. Garvey et al (1992) :160.

14. Landwehr (1981) :249.

is a transaction submitted by $u$ and applied to $D_s$, viz, to $M^s(\Sigma_s)$.

The  powers  of  «  G  C/  are  expressed  as  update-rights  Rg{u)  ^k,s  rejected  if 

s  ^G{u)  s-  Congruent  with  the  meaning  of  powers,  the  assumption  Observance  of  Pow¬ 
ers  states  that  the  database  is  not  allowed  to  ignore  a  transaction  a  user  is  entitled  to  execute,  ie 
■^G(m)  -  ^G(«)  •  This  -  presumably  common-sense  assumption  -  has  two  important  implications. 
The first one is the Update Truthfulness lemma. Suppose that the application of the transaction
$\tau_{u,s}$, $s = G(u)$, to $M^s(\Sigma_s)$ yields $M'^s(\Sigma_s)$. If $\tau_{u,s}$ is accepted by
$D_s$, then it is also propagated to $D$, ie, it is also applied to $D$ and the following condition
holds: $\forall \alpha \in \tau_{u,s} : M'^s(\Sigma_s)(\alpha) = M'(\Sigma)(\alpha)$. The
second one is the Subordinate Validity lemma. If $\tau_{u,s}$ is accepted by $D_s$, then it is also
accepted by $D$.

Lastly, the assumption is made that the database need not be a user's only source of information
on the real-world section. For a user $u \in U$ the set $K_u \subset A_G(\Sigma_s)$, $s = G(u)$,
expresses $u$'s assumed special knowledge on the real-world section. Informally, $K_u$ comprises
those facts the truth value of which $u$ can inspect without consulting the database. Formally, it
is the condition that $\forall \alpha \in K_u : M^s_G(\Sigma_s)(\alpha) = M_G(\Sigma)(\alpha)$. An
immediate consequence of it is the Observance of Knowledge lemma: $K_u \subset A_{G(u)}$ for any
$u \in U$.

The satisfaction of confidentiality demands is not unconditional. The database is required to
ascertain beforehand the circumstances of several events. Just as any transaction will be rejected
if integrity is violated, a confidentiality demand will be rejected if any of the above-defined
invariants is violated. The success depends on the ability to find a distortion of the affected
models that respects these invariants. Given a particular confidentiality demand, the enforcement
method used by Spalka/Cremers (1996) relies on aliases, ie, on the additional reversal of one or more
facts' truth values.

4  Structured  and  hierarchical  name-spaces 

In an open logic-based database the signature $\Sigma = (\mathfrak{A}, P, \rho)$ defines a flat
name-space. It is not possible to remove from it nor to add to it single ground terms without
destroying its universal name-space property. Our approach preserves this property on a local basis
by replacing the flat name-space with a global name-space that comprises several local universal
name-spaces.

As outlined in the introduction, an element of the global name-space is a tuple (a, b) such that a
designates a local name-space and b is an element of the local name-space. The formal extension
is straightforward. Given a universal name-space $\mathfrak{A}^*$ and a set $E$, we define
$N = E \times \mathfrak{A}^*$ to be a global universal name-space. $E$ is a prefix-set which
comprises the names (or selectors) of the local name-spaces, and $(a, b) \in E \times \mathfrak{A}^*$.

The usefulness of this extension with respect to a particular application depends on a suitable
choice of the prefix-set and on the allocation of name-space selectors to the application's users.
We first take a look at the most general situation - it can be subsequently tailored to any special
requirements concerning the sharing and the confidentiality of the global name-space's elements.

Let $U$ be a set of users and $\mathfrak{A}^*$ a universal name-space. Firstly, set
$E = \wp(U)$, the power-set of $U$. Then for each $e \in E$, $N_e = \{e\} \times \mathfrak{A}^*$
represents a local universal name-space. Secondly, assign to each user $u \in U$ a set
$E_u \subset \{e \in E \mid u \in e\}$. And, lastly, define $u$'s name-space, $N_u$, as

$$N_u = \bigcup_{e \in E_u} N_e$$

$$N = \bigcup_{u \in U} N_u$$

is the application's global universal name-space.

There  are  two  main  reasons  for  this  definition. 

The first one is the possibility of using the name-space selectors for the expression of both the
separation and the sharing of name-spaces among users. Let $e \in E$. Then the entities designated
by the elements of $N_e = \{e\} \times \mathfrak{A}^*$ are supposed to be shared among the users
$u \in e$ and kept secret from the users $u \in U \setminus e$. For example, if the user $u \in U$
should have an exclusive private local name-space, then we add the selector $\{u\} \in E$ to his set
of name-space selectors $E_u$. If there should be a name-space shared by the group of users
$u_1, \ldots, u_n$, then we add the selector $\{u_1, \ldots, u_n\} \in E$ to all
$E_{u_1}, \ldots, E_{u_n}$. Setting $E_u = \{e \in E \mid u \in e\}$ gives the maximum flexibility.
Here, each user has a private name-space and shares a name-space with any other group of
users.
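The choice of the power set as prefix-set can be made concrete with a small Python sketch. It is only an illustration: the user names and the helper functions `power_set`, `selectors_for` and `can_address` are hypothetical, and the maximally flexible allocation $E_u = \{e \in E \mid u \in e\}$ is assumed.

```python
from itertools import combinations


def power_set(users):
    """All subsets of the user set, each as a frozenset; these are the selectors E."""
    users = sorted(users)
    return [frozenset(c) for r in range(len(users) + 1) for c in combinations(users, r)]


def selectors_for(user, E):
    """Maximally flexible allocation: E_u = {e in E | u in e}."""
    return {e for e in E if user in e}


def can_address(user, name, E):
    """A name (e, b) lies in N_u exactly when its selector e is in E_u."""
    e, _b = name
    return e in selectors_for(user, E)


U = {"ann", "bob", "eve"}
E = power_set(U)

private_name = (frozenset({"ann"}), "diary")          # ann's exclusive name-space
shared_name = (frozenset({"ann", "bob"}), "report")   # shared by ann and bob only
```

With three users there are eight selectors, and each user holds the four that contain him: one private name-space and one shared with every group he belongs to.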

The second reason is the canonical partial order defined by the inclusion relation on a power set.
It subsumes any partial order on the set's elements in the following sense.¹⁵ Let $(S, \leq)$ be a
partial order. Then there is a unique subset $E \subset \wp(S)$ such that $(S, \leq)$ and
$(E, \supseteq)$ are isomorphic, ie, there is a bijective function $h : S \to E$ such that
$s_1 \leq s_2 \Leftrightarrow h(s_1) \supseteq h(s_2)$, $s_1, s_2 \in S$, and
$s \in h(s)$.
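One way to obtain such a function $h$ is to map every level to the set of levels that dominate it. The following Python sketch assumes this construction (the text does not spell it out) and checks the stated properties on a hypothetical four-level order in which `a` and `b` are incomparable; the names `ORDER`, `leq` and `embed` are illustrative only.

```python
# Hypothetical partial order: low <= a, b <= high, with a and b incomparable.
# ORDER lists every pair (x, y) with x <= y, including the reflexive pairs.
ORDER = {
    ("low", "low"), ("a", "a"), ("b", "b"), ("high", "high"),
    ("low", "a"), ("low", "b"), ("low", "high"),
    ("a", "high"), ("b", "high"),
}
LEVELS = {"low", "a", "b", "high"}


def leq(x, y):
    return (x, y) in ORDER


def embed(levels):
    """Assumed construction: h(s) is the set of all levels t with s <= t."""
    return {s: frozenset(t for t in levels if leq(s, t)) for s in levels}


h = embed(LEVELS)

# s1 <= s2 holds exactly when h(s1) is a superset of h(s2).
order_preserved = all(
    leq(s1, s2) == (h[s1] >= h[s2]) for s1 in LEVELS for s2 in LEVELS
)
# h is injective, hence bijective onto its image E, and every s lies in h(s).
injective = len(set(h.values())) == len(LEVELS)
contains_self = all(s in h[s] for s in LEVELS)
```

Under this construction a level $s$ belongs to $h(s')$ exactly when $s' \leq s$, so selecting the sets that contain $s$ amounts to selecting the levels below $s$.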

On these grounds we can express the extension to the database defined in section 3.3 in a very
succinct way. Let $MCP = (U, O, (S, \leq), G)$ be the given instance of a security-level-based
mandatory confidentiality policy. The main step is to find the partial order $(E, \supseteq)$
isomorphic to $(S, \leq)$. Then define the open database $D = (\Sigma, C, I)$,
$\Sigma = (N, P, \rho)$ with the global universal name-space $N = E \times \mathfrak{A}^*$,
$P \subset N$ and $\rho : P \to \mathbb{N}_0$. With every security level $s \in S$ there is an
associated database $D_s = (\Sigma_s, C_s, I_s)$. The name-space of $D_s$ is a collection of local
universal name-spaces within $N$. Set, as shown above, $E_s = \{e \in E \mid s \in e\}$. Then set
$\Sigma_s = (N_s, P_s, \rho_s)$ with $N_s = \bigcup_{e \in E_s} N_e$,
$P_s = \{p \in P \mid p \in N_s\}$ and $\rho_s = \rho|_{P_s}$.

We can eliminate $(E, \supseteq)$ in a simple transformation. The result is a definition that refers
only to the security levels $S$. Since the function $h : S \to E$ is bijective, we can define
$D = (\Sigma, C, I)$ and $\Sigma = (N, P, \rho)$ with $N = S \times \mathfrak{A}^*$, $P \subset N$
and $\rho : P \to \mathbb{N}_0$. In the next step, we replace the inclusion with the partial order
on $S$ and set $N_s = \{s' \in S \mid s' \leq s\} \times \mathfrak{A}^*$,
$P_s = \{p \in P \mid p \in N_s\}$ and $\rho_s = \rho|_{P_s}$, which define
$\Sigma_s = (N_s, P_s, \rho_s)$ and $D_s = (\Sigma_s, C_s, I_s)$.
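The level-based definition lends itself to a direct sketch in Python. The order, the level names and the predicate symbols below are hypothetical, as are the helper names `visible` and `predicates_at`; the sketch only shows how the predicate symbols of a level $s$ follow from the selectors $s' \leq s$.

```python
# Hypothetical partial order: low <= a, b <= high, with a and b incomparable.
ORDER = {
    ("low", "low"), ("a", "a"), ("b", "b"), ("high", "high"),
    ("low", "a"), ("low", "b"), ("low", "high"),
    ("a", "high"), ("b", "high"),
}


def leq(x, y):
    return (x, y) in ORDER


def visible(name, s):
    """A global name (a, b) belongs to N_s exactly when its selector a satisfies a <= s."""
    a, _b = name
    return leq(a, s)


def predicates_at(P, s):
    """P_s: the predicate symbols of P whose names lie in N_s."""
    return {p for p in P if visible(p, s)}


# Predicate symbols are global names (selector, local name).
P = {("low", "emp"), ("a", "salary"), ("b", "salary"), ("high", "mission")}

P_low = predicates_at(P, "low")
P_a = predicates_at(P, "a")
P_high = predicates_at(P, "high")
```

Note that `("a", "salary")` and `("b", "salary")` are distinct global names with the same local part; level `a` sees only the first, level `high` sees both.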

Lastly, we define the security semantics of the extension, ie, the meaning of a confidentiality
demand stated for ground terms or predicate symbols. A unit of protection can be any ground
term $t \in T_G(\Sigma)$ and any predicate symbol $p \in P$. The intended meaning of the primitive
mandatory requirement for ground terms and predicate symbols is as follows.

Let $w = (a, b)$ be a ground term or a predicate symbol and $s \in S$. If $G(w) \leq s$ does not hold,


15. Cf Davey/Priestley.

then $w \notin T_G(\Sigma_s)$ and $w \notin P_s$, ie $w \in Z_s$.

We immediately see that a necessary condition for the satisfaction of a confidentiality demand for
$w = (a, b)$ is $a \geq G(w)$.
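The condition can be traced back to the definition of the level name-spaces. The following derivation is a sketch that looks only at the visibility of the name $w$; it assumes that $w = (a, b)$ is available at level $s$ exactly when $a \leq s$, and it leaves aside the remaining invariants of section 3.3.

```latex
\begin{align*}
&\text{Demand: } \forall s \in S:\ G(w) \not\leq s \implies w \notin N_s,
  \quad\text{where}\quad w = (a, b) \in N_s \iff a \leq s \\
&\text{Contrapositive: } \forall s \in S:\ a \leq s \implies G(w) \leq s \\
&\text{Choosing } s = a:\ G(w) \leq a \\
&\text{Conversely, if } G(w) \leq a \text{ and } a \leq s,
  \text{ then } G(w) \leq s \text{ by transitivity.}
\end{align*}
```

Under the stated assumption, the selector of a name must therefore dominate the name's security level, and any such selector hides the name from every level that does not dominate $G(w)$.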

To conclude the theoretical part, we note that in our database - like in every logic-based database
with a unique intended model - there are no NULL-values. There are good reasons for their
exclusion. Speaking in a colloquial manner, NULL-values make the semantics of even open
databases messy. For example, just by looking at a database there is no way to tell if a NULL-value
means that a value is unknown or if the attribute is not applicable here. Although we have not
investigated it formally, the corresponding works on open databases indicate that their use for
confidentiality purposes will not make the situation any less confusing.

Let us briefly restate the semantic properties of this database. The open database $D = (\Sigma, C, I)$
and its intended model $M(\Sigma)$ capture at all times the image of the open real-world section. If there
is a user from whom nothing is kept secret, then this is his database. The database at a security
level $s \in S$, $D_s = (\Sigma_s, C_s, I_s)$, and its intended model $M^*(\Sigma_s)$ capture an image of the same open
real-world section as $D = (\Sigma, C, I)$. Due to confidentiality demands, this image can be distorted.
Speaking somewhat loosely, the distortion-log $(A_s, Z_s)$ keeps track of the - deliberately
introduced - lies, $A_s$, and the - also deliberately withdrawn - missing information, $Z_s$. A query
submitted to $D_s$ is always evaluated with respect to $M^*(\Sigma_s)$. This means that no trusted user is
ever confused^(17) - we do not have to guess about versions, degrees of interest or recency^(18), nor
do we have to deal with imprecise beliefs^(19). The acceptance of update operations, which can be
combined with the specification of confidentiality demands, is subject not only to the integrity
constraints but also to the powers of the issuing user, the Observance of Powers assumption and
the Observance of Knowledge lemma. They all must always be satisfied. Aliases are additional
distortions a user can introduce to support his confidentiality demand.

5  Two  modelling  examples 

5.1  A  small  transport  company 

Let us illustrate the application of our theory in an example of a company that transports
goods in cars.

Suppose that the alphabet $\mathfrak{A}$ comprises the letters of the English alphabet and the digits 0
through 9, and that there are four security levels $L = \{l_1, l_2, l_3, l_4\}$ such that $l_1 > l_3 > l_4$ and
$l_1 > l_2 > l_4$. The company represents its data in three relations: ($l_4$, base), ($l_4$, cargo) and
($l_2$, value). The first one, ($l_4$, base), stores the company’s cars; ($l_4$, cargo) stores the assignment
of cargoes to cars; and ($l_2$, value) tells us a cargo’s value in USD. The choice of $l_4$, the
lowest security level, as the name-space selector for ($l_4$, base) and ($l_4$, cargo) reflects the fact
that every user knows, or needs to know, that the company keeps a record of its cars and the
cargo assignments. The selector $l_2$ tells us that only users with a clearance of $l_1$ or $l_2$ are supposed
to know that the cargoes’ value is also stored. Formally:

• $P_{l_1} = P_{l_2} = \{$($l_2$, value), ($l_4$, base), ($l_4$, cargo)$\}$

• $P_{l_3} = P_{l_4} = \{$($l_4$, base), ($l_4$, cargo)$\}$

16. A detailed discussion of this issue and an approach to making NULL-values precise can be found,
eg, in Reiter (1984).

17. Cf Wiseman (1989).

18. Cf Denning et al (1987).

19. Cf Smith/Winslett (1992).

At the moment, the company owns two cars. The first one is parked in front of the main entrance
and is known to everybody, thus, we name it ($l_4$, dog). The local part of the name, dog, is arbitrary.
The fact that this car is shared among all is reflected in its selector $l_4$. The second car is
hidden in a secret garage of which only users cleared at $l_1$ know - we name it ($l_1$, cat). Thus, the
relation ($l_4$, base) comprises two tuples: ($l_4$, base)(($l_4$, dog)) and ($l_4$, base)(($l_1$, cat)). Formally,
($l_4$, base)(($l_4$, dog)) is a ground atomic formula of every database’s language: $A_0(\Sigma)$ and
$A_0(\Sigma_{l_1}), \ldots, A_0(\Sigma_{l_4})$; ($l_4$, base)(($l_1$, cat)) belongs, of course, to $A_0(\Sigma)$ and only to $A_0(\Sigma_{l_1})$.

Now suppose that a new car is bought, which, placed in a different secret garage, should only be
available to users with a clearance of $l_2$ and $l_1$. By chance, a user at $l_2$ decides to name it cat.
Congruent with the confidentiality demand and without any naming conflicts, he can enter
($l_2$, cat) into ($l_4$, base).^(20) Formally, ($l_4$, base)(($l_2$, cat)) is only an element of $A_0(\Sigma)$, $A_0(\Sigma_{l_2})$
and $A_0(\Sigma_{l_1})$.

In the end, there is no doubt about the number of owned cars, who shares which car and which
car is kept secret from whom.

Let  us  now  briefly  list  some  more  combinations: 

• ($l_4$, cargo)(($l_4$, dog), ($l_4$, food)), an element of all languages, is a transport of a cargo in a
car known to everybody; requires no aliases

•  (l^,  cargo)i(l2,  cat),  (/g,  liquor)) ,  an  element  of  only  Aq(Z)  and  Aq(Ei^)  ,  is  a  transport 
known  only  to  I ^  in  a  car  known  to  /j  and  /g  of  a  cargo  known  only  to  /j  and  /g ;  may 
require  aliases  in  and 

• ($l_2$, value)(($l_4$, food), ($l_4$, 1000)), an element of $A_0(\Sigma)$, $A_0(\Sigma_{l_1})$ and $A_0(\Sigma_{l_2})$, is a correspondence
known only to $l_1$ and $l_2$ of a public cargo and a public amount; requires no
aliases

•  (/g,  value){{l2,  liquor),  (Z^,  2000)),  an  element  of  only  A^fS)  and  Aq(Zi^,  is  a  corre¬ 
spondence  known  only  to  Z4  of  a  cargo  known  to  Z4  and  Zg  and  an  amount  known  only  to 
Z4 ;  may  require  aliases  in 
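The membership claims of this example can be checked mechanically. The following Python sketch is an illustration under stated assumptions, not part of the theory: it assumes four levels in which l1 dominates the incomparable levels l2 and l3, and both of these dominate l4, and it assumes that a ground atomic formula belongs to the language at a level exactly if the selectors of the relation name and of all argument names are dominated by that level. The names `BELOW`, `in_name_space` and `languages_containing` are hypothetical.

```python
# Assumed ordering: l1 on top, l4 at the bottom, l2 and l3 incomparable.
# BELOW[s] is the set of levels dominated by s (including s itself).
BELOW = {
    "l1": {"l1", "l2", "l3", "l4"},
    "l2": {"l2", "l4"},
    "l3": {"l3", "l4"},
    "l4": {"l4"},
}


def in_name_space(level, name):
    """True if the selector of name = (selector, local) is dominated by level."""
    selector, _local = name
    return selector in BELOW[level]


def languages_containing(relation, *arguments):
    """Levels whose language contains the ground atom relation(arguments)."""
    names = (relation, *arguments)
    return [s for s in sorted(BELOW) if all(in_name_space(s, n) for n in names)]


BASE = ("l4", "base")
VALUE = ("l2", "value")

print(languages_containing(BASE, ("l4", "dog")))   # the car known to everybody
print(languages_containing(BASE, ("l1", "cat")))   # the car in the first secret garage
print(languages_containing(BASE, ("l2", "cat")))   # the newly bought car
print(languages_containing(VALUE, ("l4", "food"), ("l4", "1000")))
```

Because the selector is part of the name, the two cars called cat never collide: they are distinct names that happen to share a local part.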

5.2  The  Bell/La  Padula  access  control  model 

The  work  of  Bell/La  Padula  (1975)  has  adapted  mandatory  controls  used  in  a  paper-based  envi¬ 
ronment  to  operating  systems.  Their  mandatory  access  control  system,  also  called  the  Bell/La 
Padula  model,  is  mainly  remembered  for  its  two  access  control  rules,  the  Simple-Security-Prop¬ 
erty  and  the  *-Property.  Operating  systems  adhering  to  this  model  possess  a  number  of  excellent 
security  properties  -  the  resistance  against  a  broad  range  of  untrustworthy  programs  is  among 
the  ones  most  often  mentioned. 

The  model’s  handling  of  name-spaces  has  received  little  attention.  Admittedly,  the  authors  do  not 
deal with this issue explicitly. Yet a user can name a new object any way he wants and a
confidentiality-violating naming conflict will never occur. The simple reason for this is that each security
level has its own local universal name space and the level’s name serves as a selector for a name
space. The set of selectors available to a user is determined by his clearance and comprises all
security levels dominated by it.

20. Since this is a unary relation we do not need aliases here.

In this example we model Bell/La Padula’s operating system concepts in our database.

Let $MCP = (U, O, (S, \le), G)$ be an instance of a security-level-based mandatory confidentiality
policy. Here, $U$ are subjects. An object $o = ((l, v), c) \in O$ has a structured name $(l, v)$ and a
content $c$, where $v, c \in \mathfrak{A}^*$, $l \in S$ and $\mathfrak{A}$ is the system’s alphabet, eg, a subset of the ASCII set. Define
the labelling function $G : U \cup O \to S$ such that $G(o) = l$.

The corresponding database is quite simple. Define $D = (\Sigma, C, I)$ and $\Sigma = (N, P, p)$ with
$N = S \times \mathfrak{A}^*$, $P = \{(z, r), (z, eq)\}$ and $p((z, r)) = p((z, eq)) = 2$. Here we must assume that
there exists a smallest element $z \in S$ with respect to the partial order on $S$. The relation $(z, eq)$,
the equality, is needed to express integrity constraints. $(z, r)$ stores the objects in the following
way. An object $o = ((l, v), c)$ is represented as the tuple $(z, r)((l, v), (z, c))$. This corresponds to
the container-orientated view on protection: $(l, v)$ represents both the container’s name and protection
requirements and $(z, c)$ the container’s content, which is also shielded by the container’s
protection. Formally, $(z, r)((l, v), (z, c))$ is only a ground atomic formula of the language of $D_r$ if
$r \ge l$. To ensure the uniqueness of the containers’ names, we define the first attribute of $(z, r)$
to be the relation’s primary key.

The observance of the Simple-Security-Property is already guaranteed by the definition of our
database. A subject $u$ can only query the database $D_l$ if $G(u) \ge l$. To enforce the *-Property we
can restrict the update commands. Let a transaction $d \cup i$ be submitted by $u$ and applied
to $D_l$. Then $G(u) \ge l$ and the transaction has one of the following forms:

• INSERT INTO (z, r) VALUES ((l, <local name v>), (z, <content c>))

• DELETE FROM (z, r) WHERE <any condition> AND name = (l, *)

• UPDATE (z, r) SET content = (z, <content c>) WHERE <any condition> AND name = (l, *)

Note that it is these restrictions that obviate the need for aliases in this example of a database.
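The two checks described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: commands are reduced to their kind and the structured name of the container they touch, the two-level ordering is hypothetical, and `Command`, `dominates` and `accept` are names introduced here, not part of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    kind: str    # "INSERT", "DELETE" or "UPDATE"
    name: tuple  # structured container name (level selector, local name)


# Hypothetical ordering: ORDER[s] is the set of levels dominated by s.
ORDER = {"low": {"low"}, "high": {"low", "high"}}


def dominates(high, low):
    """True if low <= high in the assumed partial order."""
    return low in ORDER[high]


def accept(clearance, target_level, commands):
    """Decide whether a transaction may be applied to the database at target_level."""
    # Simple-Security-Property: the subject may only work with a database
    # at a level dominated by its clearance.
    if not dominates(clearance, target_level):
        return False
    # *-Property: every command must name containers whose selector is
    # exactly the level of the database the transaction is applied to.
    return all(c.name[0] == target_level for c in commands)


print(accept("high", "low", [Command("INSERT", ("low", "report"))]))
print(accept("high", "low", [Command("UPDATE", ("high", "plan"))]))
```

A subject cleared at `high` can still write low containers, but only by submitting the transaction to the low database, where it can neither see nor name high containers.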

6  Conclusion 

The theory presented here addresses the semantics and the enforcement of confidentiality
demands in databases at the level of function symbols and predicate symbols. Such a confidentiality
demand refers to a symbol’s membership in a database’s language. The theory preserves
the  syntax  and  semantics  of  standard  open  databases.  There  is  no  ambiguity  with  respect  to  the 
entities  and  relations  of  the  real-world  section  of  an  application.  And  there  is  no  ambiguity  with 
respect  to  the  evaluation  of  queries  and  integrity  constraints  -  in  an  equijoin  the  unique  names 
within  the  joined  attribute  will  always  compute  the  standard  result  and  a  foreign-key  constraint 
will always unequivocally point to the corresponding tuples.

Abstract

This paper introduces guidelines aiming at the prevention of illegal information flows due to
object deletion in multilevel secure object database management systems (ODBMSs). Although a
delete operation can be viewed as a kind of write operation, this does not suffice to avoid
covert channels. Hence, the attention is focused on the delete operation and its implications on
database security. The guidelines we propose are formally stated as security principles. We also
show how to design a garbage collection mechanism in a multilevel secure ODBMS. The garbage
collection ensures both security and referential integrity.

1  Introduction 

Object-oriented database management systems and recent object-relational database management
systems (in what follows we will refer to both kinds of systems as object database management
systems - ODBMSs for short) continue to be an active research area for both the academic and the
industrial world. Issues related to security and privacy have been investigated in the area of
ODBMSs and models have been proposed for both mandatory access controls [15, 18, 23] and
discretionary access controls [21]. However, much work is still needed in this area. In
particular, even if some approaches, developed for relational DBMSs, can be directly applied to
ODBMSs, new security problems arise that are specific to object manipulation in ODBMSs. We
believe that addressing such security issues is important given the relevance of object
technology in the current and next generations of DBMSs. Moreover, security is today an
important concern in many object-oriented platforms for distributed computing and application
development, as witnessed by recent efforts for developing a security standard for CORBA [7].
Therefore, even though we cast our research in the framework of ODBMSs, results of our research
can be applied to other object systems.


An important issue in ODBMSs is related to object deletion. Two different approaches are used by
existing ODBMSs to enforce object deletion: under the first approach users are allowed to
explicitly delete objects; under the second approach a garbage collection mechanism is used by
which an object is removed by the system when no longer reachable by other objects. Because
under the latter approach no explicit delete operation is provided at application level,
referential integrity is ensured. However, object deletion and garbage collection are operations
that, if not properly implemented, could be exploited as covert channels, thus bypassing the
access controls usually implemented by ODBMSs. An important requirement for those operations is
therefore to be secure from covert channels and, at the same time, to ensure referential
integrity among objects in the database. Because of the relevance of garbage collection in
object systems, several algorithms have been proposed for both centralized and distributed
systems [13, 17, 19]. Here, we continue our investigation in object deletion and secure garbage
collection [4]. We first show how object deletion and garbage collection could be illegally
exploited to perform unauthorized data accesses; then, we introduce some principles ensuring a
secure delete operation. Moreover, we present a garbage collection protocol which is secure
against covert channels. The main differences between the work presented in this paper and our
previous work [4] can be summarized as follows. First, here we provide a formal setting to
address secure object delete operations. Second, in our previous paper the copying approach to
garbage collection was considered. Here, we consider a different approach, based on the
mark-and-sweep technique, and show how the proposed approach is secure with respect to the
formal setting. In particular, the formal setting we propose consists of a number of principles
forming the basic guidelines for secure object deletion and garbage collection. These guidelines
provide a concrete embodiment of the general Bell-LaPadula principles [1] for the specific case
of object deletion and garbage collection. We believe that such guidelines are an important step
towards the development of secure object systems.

The remainder of this paper is organized as follows. Section 2 briefly recalls the basic
concepts of multilevel security and outlines the object model we refer to in this paper.
Section 3 describes the create operation in the framework of the object model presented in
Section 2. Section 4 introduces the problem of object deletion in a secure object environment
and provides rules for secure object deletion. Section 5 analyzes the mark-and-sweep garbage
collection protocol with respect to the principles presented in Section 4. Finally, Section 6
concludes the paper and outlines future work.

2  Preliminary  Concepts 

In this section we first recall the basic concepts of multilevel security and describe the
message filter model [15]. Moreover, we briefly characterize the reference object model we refer
to in the paper.

2.1  The  Multilevel  Security  Model 

The system consists of a set $O$ of objects (passive entities), a set $S$ of subjects (active
entities), and a set $Lev$ of security levels with a partial ordering relation $\le$. A level
$L_i$ is dominated by a level $L_j$ if $L_i \le L_j$. Moreover, a level $L_i$ is strictly
dominated by a level $L_j$ (written $L_i < L_j$) if $L_i \le L_j$ and $i \ne j$. We say that two
levels $L_i$ and $L_j$ are incomparable (written $L_i \mathrel{<>} L_j$) if neither
$L_i \le L_j$ nor $L_j \le L_i$ holds. A total function $\mathcal{L}$, called security
classification function, is defined from $O \cup S$ to $Lev$. Given an object $o$, function
$\mathcal{L}$ returns the security classification of $o$. Similarly, given a subject $s$,
$\mathcal{L}(s)$ denotes the security classification of $s$.

A secure system enforces the Bell-LaPadula restrictions that can be stated as follows [1]:

1. A subject $s$ is allowed to read an object $o$ if and only if $\mathcal{L}(o) \le \mathcal{L}(s)$
(no-read-up).

2. A subject $s$ is allowed to write an object $o$ if and only if $\mathcal{L}(s) \le \mathcal{L}(o)$
(no-write-down).

The second property is also known as the *-property and prevents leakage of information due to
Trojan Horses. Additional details can be found in [8].
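The two restrictions translate directly into a pair of checks. The following Python sketch is an illustration only; the level names and the helpers `dominated_by`, `may_read` and `may_write` are hypothetical, and the partial order is assumed to be given as a table of the levels dominated by each level.

```python
# Hypothetical partial order: ORDER[x] is the set of levels dominated by x.
# "army" and "navy" are incomparable; both dominate "public" and are
# dominated by "joint".
ORDER = {
    "public": {"public"},
    "army": {"public", "army"},
    "navy": {"public", "navy"},
    "joint": {"public", "army", "navy", "joint"},
}


def dominated_by(low, high):
    """True if low <= high in the partial order."""
    return low in ORDER[high]


def may_read(subject_level, object_level):
    """No-read-up: the object's level must be dominated by the subject's."""
    return dominated_by(object_level, subject_level)


def may_write(subject_level, object_level):
    """No-write-down: the subject's level must be dominated by the object's."""
    return dominated_by(subject_level, object_level)


print(may_read("army", "public"), may_write("army", "public"))
print(may_read("army", "navy"), may_write("army", "navy"))
```

For incomparable levels both checks fail, so no information can flow in either direction.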

2.2  The  Reference  Object  Model 

An object database consists of a set of objects exchanging information via messages. An object
consists of a unique object identifier (OID), which is fixed for the whole life of the object,
and a set of attributes, whose values represent the state of the object. The value of an
attribute can be an object or a set of objects. Moreover, an object has a set of methods
encapsulating the object state. Methods are used to modify the state of the object or to perform
other types of computation on the object.

An object can be primitive (like an integer, or a character), or can be built from other objects
(either primitive or non-primitive). We denote a non-primitive object as a triple
(oid, state, meths), where:

• oid is the object identifier;

• state = ($a_1 : v_1, a_2 : v_2, \ldots, a_n : v_n$),^1 where $a_i$ is an attribute name (the
names of object attributes must be distinct), and $v_i$ is the value of attribute $a_i$ and can
be a primitive object or an OID, $i = 1, \ldots, n$. The possible values that an object
attribute may take are specified in the definition of the class to which the object belongs;

• meths is a set of method names.

Let $o$ and $o'$ be two objects. We say that $o$ is a high-level (low-level) object with respect
to $o'$ if $\mathcal{L}(o') < \mathcal{L}(o)$ ($\mathcal{L}(o) < \mathcal{L}(o')$). Similarly,
let $o$ and $o'$ be two objects such that $o$ stores in one of its attributes the OID of $o'$. We
say that the OID of $o'$ is a high-level (low-level) OID with respect to $o$ if
$\mathcal{L}(o) < \mathcal{L}(o')$^2 ($\mathcal{L}(o') < \mathcal{L}(o)$).

Whenever an object $o$ has as value of one of its attributes the OID of an object $o'$, we say
that $o$ references $o'$.

In our model, no reference is allowed among objects at incomparable security levels. Moreover,
we make the assumption, common to most proposals [15, 18, 23], that all objects are single-level
and, therefore, all attributes of an object have the same security level. Multilevel objects can
easily be represented in terms of single-level objects; we refer the reader to [2, 3] for a
detailed discussion on this issue. [2, 3] do not address the problem of object creation;
however, the creation of a multilevel object can be performed by creating a single-level object
for each distinct security level. The creation of these objects can be performed using a
covert-channel-free OID generation mechanism such as the one illustrated in Section 3.

^1 We make the assumption that non-primitive objects are built using the tuple constructor.
Other constructors may also be used, like the set and list constructors [5]. However, the
specific constructor type used is not relevant for the present discussion.

^2 This is possible because we allow an object to create objects at strictly higher levels (see
Section 3 for more details on the create operation).

The methods of an object can be invoked by sending a message to the object. Upon the reception of
the message, the corresponding method is executed, and a reply is returned to the object sending
the message. The reply can be either an OID, a primitive object, or a special nil value, which
denotes that no information is returned.

The invocation of a method m on the reception of a message can be either synchronous or
asynchronous. In the former case, the sender waits for the reply value, that is, it is suspended
until the invoked method terminates. In the latter case, a nil reply value is immediately
returned to the sender, which is then executed concurrently with the receiver; the sender will be
able to get the reply value later. In this paper we do not distinguish between synchronous and
asynchronous method invocations. Moreover, we assume that method invocations are performed
sequentially during a user session within the system, that is, a user cannot invoke parallel
method executions.

The fact that messages are the only means by which objects can exchange information gives
information flow in object systems a very concrete and natural embodiment in terms of messages
and their replies [15]. Thus, information flow in object systems can be controlled by mediating
message exchanges among objects.

2.3  The  Message  Filter  Model

The Bell-LaPadula model has been applied to the object model by means of the message filter
[15]. Under this approach all messages exchanged among objects in the system are filtered
according to the following rules:

1. If the sender of the message is at a level strictly dominating the level of the receiver,
the method invoked by the message is executed by the receiver in restricted mode, that is, no
update can be performed. More precisely, a restricted mode execution at a level $l$ should be
memoryless at level $l$. Therefore, even though the receiver can see the message, the execution
of the corresponding method on the receiver should leave the state of the receiver (as well as
of any other object at a level not dominated by the level of the receiver) as it was before the
execution.

2. If the sender of the message is at a level strictly dominated by the level of the receiver,
the method is executed by the receiver in normal mode, but the returned value is nil. To prevent
timing channels, the nil value is returned to the sender before actually executing the method.
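The filtering rules can be sketched in Python as follows. This is a simplified illustration, not the message filter of [15]: methods are modelled as plain functions acting on a state dictionary, restricted mode is approximated by running the method on a scratch copy of the state, and the sketch is sequential, so the early return of nil that prevents timing channels is only noted in a comment. The treatment of incomparable levels (the message is not delivered) is an assumption of this sketch, and the names `ORDER`, `deliver` and `set_x` are hypothetical.

```python
import copy

NIL = None  # stands for the special nil reply value

# Hypothetical ordering: ORDER[x] is the set of levels dominated by x.
ORDER = {"low": {"low"}, "high": {"low", "high"}}


def leq(a, b):
    """True if level a is dominated by level b."""
    return a in ORDER[b]


def deliver(sender_level, receiver, method, *args):
    """Filter one message sent to receiver = {"level": ..., "state": {...}}."""
    level = receiver["level"]
    if sender_level == level:
        # Same level: normal execution, the reply is passed back.
        return method(receiver["state"], *args)
    if leq(level, sender_level):
        # Rule 1: the sender strictly dominates the receiver. Restricted mode:
        # the method runs on a copy, so the receiver's state is left untouched.
        scratch = copy.deepcopy(receiver["state"])
        return method(scratch, *args)
    if leq(sender_level, level):
        # Rule 2: the sender is strictly dominated by the receiver. Normal
        # execution, but only nil goes back (a real filter would return nil
        # before the method is executed).
        method(receiver["state"], *args)
        return NIL
    # Assumption of this sketch: incomparable levels exchange nothing.
    return NIL


def set_x(state, value):
    """Example method: overwrite attribute x and report its new value."""
    state["x"] = value
    return state["x"]


low_obj = {"level": "low", "state": {"x": 1}}
high_obj = {"level": "high", "state": {"x": 1}}
print(deliver("high", low_obj, set_x, 5), low_obj["state"])
print(deliver("low", high_obj, set_x, 7), high_obj["state"])
```

In the first call the high sender obtains a reply but cannot change the low object; in the second the low sender changes the high object but learns nothing about the outcome.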

The first principle ensures that an object does not write-down, whereas the second one ensures
that an object does not read-up. The message filter is a trusted component of the object system
in charge of enforcing the above principles on all message exchanges among objects. Note that,
according to the reference object model, an object is allowed to reference a high-level object;
this means that an object may have as value of one of its attributes a high-level OID. This
possibility allows an object to send information to objects at higher levels. However, since
every message is intercepted by the message filter, this possibility does not violate the
overall security of the system since read-up operations will always return a nil response value.
Moreover, an object of level $L_i$ may only store the OIDs of the high-level objects whose
creation has been requested by a level lower than or equal to level $L_i$. The mechanism we
adopt for OID generation is described in the following section.

3  Create  operation 

From a security perspective, create is an important operation since it establishes the
visibility of object OIDs across security levels. The create operation allows a subject (or an
object) to create an object with a security level higher than or equal to the level of the
creator. Obviously, a subject (object) cannot create objects at strictly lower levels than its
security level. The create message has as arguments the list of attribute values (either
primitive objects or OIDs) and the security level to be assigned to the created object. The OID
of the created object is returned to the object that has requested the creation. A consequence
of this approach is that objects may store, as part of their state, OIDs of objects at higher
levels.

We make several assumptions about the OIDs. First, OIDs are logical, that is, they do not
contain information about the physical location of the corresponding objects. Given an OID, a
hash table is used to determine its physical location. Second, there is a separate OID
generation mechanism at each level. The OIDs generated at a level $L$ are for the objects whose
creation has been requested at level $L$. Finally, we assume that the OID of each object also
contains the security level assigned to the object upon its creation. This assumption does not
introduce any security flaw, since the level of the object is specified as part of the create
operation. Thus, the OID of an object $o$ consists of three components: $L$, $L'$ and $c$, where
$L$ is the level of the creator of $o$, $L'$ is the level assigned to $o$ upon its creation,
with $L \le L'$, and $c$ is an integer that uniquely identifies $o$ at level $L$.
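A per-level OID generator of this kind can be sketched in Python as follows. This is an illustration under stated assumptions rather than the mechanism of any particular system: the class and field names (`OID`, `OIDGenerator`, `creator_level`, `object_level`, `serial`) and the two-level ordering are hypothetical, and serial numbers are assumed to be simple counters.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class OID:
    creator_level: str  # L, the level at which the creation was requested
    object_level: str   # L', the level assigned to the object, with L <= L'
    serial: int         # c, unique among the OIDs generated at level L


# Hypothetical ordering: ORDER[x] is the set of levels dominated by x.
ORDER = {"low": {"low"}, "high": {"low", "high"}}


class OIDGenerator:
    """Keeps one independent counter per creator level."""

    def __init__(self):
        self._counters = {}

    def create(self, creator_level, object_level):
        # An object may only be created at the creator's level or above.
        if creator_level not in ORDER[object_level]:
            raise PermissionError("cannot create an object at a lower level")
        counter = self._counters.setdefault(creator_level, count(1))
        return OID(creator_level, object_level, next(counter))


gen = OIDGenerator()
print(gen.create("low", "high"))
print(gen.create("high", "high"))
print(gen.create("low", "low"))
```

Because the counter is selected by the creator's level, creations requested at a high level never change the serial numbers observed at a low level, so the sequence of OIDs cannot be used to signal downwards.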

4  Secure  Delete  operation 

We start by describing the most common approaches to object deletion and the security problems
the delete operation can cause. Then, we present a set of principles ensuring the security of
the delete operation for mandatory access control. Finally, we discuss implementation issues.

4.1  Delete  Operation 

Existing ODBMSs use different approaches with respect to the delete operation. There are two
categories of systems: systems supporting explicit delete operations (like Orion [12] and Iris
[11]), and systems using a garbage collection mechanism (like O2 [10] and Gemstone [16]). A
garbage collector is a piece of software that deletes objects that are no longer accessible.
There is a special object, called root, which is always persistent. All objects that can be
reached from the root by traversing (directly or indirectly) object references are persistent.
An object is removed when it can no longer be reached from the root.
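Reachability from the root can be sketched in a few lines of Python. The sketch below is a plain, security-unaware collector, shown only to make the notion of persistence concrete; it is not the secure protocol discussed later. The representation of the database as a dictionary from OIDs to lists of referenced OIDs and the names `reachable` and `collect` are assumptions of this sketch.

```python
def reachable(root, references):
    """Return the set of OIDs reachable from root.

    references maps each OID to the OIDs stored in its attributes.
    """
    marked, stack = set(), [root]
    while stack:
        oid = stack.pop()
        if oid in marked:
            continue
        marked.add(oid)
        stack.extend(references.get(oid, ()))
    return marked


def collect(root, references):
    """Keep only the objects that are still reachable from root."""
    live = reachable(root, references)
    return {oid: list(targets) for oid, targets in references.items() if oid in live}


# Object "c" references "b" but is itself unreachable, so it is removed.
database = {"root": ["a"], "a": ["b"], "b": [], "c": ["b"]}
print(sorted(collect("root", database)))
```

Since an object is removed only when nothing reachable refers to it, such a collector never produces dangling references; the security question is what its decisions reveal across levels.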

If  the  delete  operation  is  not  properly  executed, 
covert  channels  may  be  established. 


Figure 1: An object o' with references from its own level and a higher level

Example 1 Consider two objects $o$ and $o'$ such that $\mathcal{L}(o) = \mathcal{L}(o') = L_1$,
and an object $o''$ such that $\mathcal{L}(o'') = L_2$, with $L_1 < L_2$. Suppose that both $o$
and $o''$ reference object $o'$. The example is graphically illustrated in Figure 1. Suppose
that the reference from $o$ to $o'$ is removed, because $o$ has been deleted. If a garbage
collection approach is used, object $o'$ would not be deleted since there is still another
object (i.e., $o''$) referencing it. Therefore two subjects at levels $L_1$ and $L_2$,
respectively, could exploit this fact to establish a covert channel. The subject at level $L_1$
would create two objects $o$ and $o'$, at its own level, such that $o$ references $o'$. Then,
the subject at level $L_2$ would create an object $o''$, at its own level, such that $o''$
references $o'$. Then, after an amount of time pre-defined by the two subjects, the subject at
level $L_1$ would remove the reference from $o$ to $o'$. If, after the reference has been
removed, object $o'$ still exists, this situation is interpreted as 1. By contrast, if object
$o'$ is removed, this situation is interpreted as 0. Note that the subject at level $L_1$ would
simply need to check storage occupancy to determine whether $o'$ still exists.

Exploiting the above covert channel requires collusion of two subjects at different levels.
Note, however, that this is a common situation for many types of covert channels. See, as an
example, covert channels exploiting concurrency control mechanisms in DBMSs [14].

Moreover, whenever storage is deallocated because of object deletion, the problem of dangling
references may arise. A dangling reference occurs when there is a reference to storage that has
been deallocated. In systems with explicit delete operations dangling references may arise since
an object can be removed even if there are references to it. In a garbage collection
environment, an untrusted collector could intentionally remove an object to create a dangling
reference.

A  security  problem  is  that  dangling  references  can 
be  used  to  establish  covert  channels,  as  the  following 
example  shows. 

Example 2 Consider Figure 2(a). If object $o'$ is deleted by a subject at level $L_2$, a
dangling reference appears in object $o$ at level $L_1 < L_2$ (Figure 2(b)). A subject at level
$L_1$ could infer the deletion of object $o'$ by trying to send a write message to the object.
On the basis of the result of such operation (run-time error or successful update), the subject
at level $L_1$ gets one bit of information from a higher security level.

Thus, the deletion of objects referenced by low-level objects can be
exploited by low-level subjects to infer information from high-level
objects. A subject at a security level L2 could delete a subset of
the objects referenced by objects at a security level L1 < L2. Then
a subject at level L1 could try to access all high-level objects,
resulting in a sequence of unsuccessful and successful accesses.
Hence, an arbitrary string of bits of reserved information could be
transmitted from a higher security level. Note that, in a garbage
collection environment, an untrusted collector could intentionally
remove the objects at level L2 referenced by low-level objects in
order to establish a covert channel.
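
The bit-string transmission described above can be illustrated with a minimal sketch. It is not part of the original model: the `Store` class, the `send_bits` and `receive_bits` helpers, and the convention that a deleted object encodes a 1 are all assumptions made only for illustration, and the store deliberately enforces none of the security principles.

```python
class Store:
    """Toy object store with explicit deletes and no integrity checks (hypothetical)."""

    def __init__(self):
        self.objects = {}  # oid -> security level

    def create(self, oid, level):
        self.objects[oid] = level

    def delete(self, oid):
        # Explicit delete: references held elsewhere are left dangling.
        del self.objects[oid]

    def write(self, oid):
        # True models a successful update, False a run-time error
        # raised because the reference is dangling.
        return oid in self.objects


def send_bits(store, high_oids, bits):
    """High-level sender: delete the i-th object to signal a 1."""
    for oid, bit in zip(high_oids, bits):
        if bit:
            store.delete(oid)


def receive_bits(store, referenced_oids):
    """Low-level receiver: probe every reference it holds."""
    return [0 if store.write(oid) else 1 for oid in referenced_oids]
```

Each probe yields one bit, so n references from low-level objects to high-level objects give the colluding subjects an n-bit channel per round.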

As Examples 1 and 2 above show, there are many ways in which a
delete operation can be exploited to establish a covert channel.
However, it is important to note that, when an object is deleted,
the only side effects on the database are a state transition of the
database itself and, possibly, the generation of dangling
references. Therefore, the only means to establish a covert channel
exploiting object deletion are those stated by the following
definition.

Figure 2: An object o referencing a high-level object

Definition 1 (Delete Covert Channel) A delete covert channel is a
covert channel established by one of the following means:
(i) exploiting dangling references; (ii) monitoring the state of the
system with respect to object or memory allocation;^ or
(iii) performing intentional data scavenging.^

In the above definition, with the term state of the system we refer
to the state of all objects stored in the system and all the
information related to the system itself, such as memory allocation
and method error codes.

Moreover, delete operations can be regarded as a form of write
operations: depending on the specific delete approach used,
information may have to be set into the object being deleted (such
as reference counts). It is therefore necessary to prevent them from
being exploited to establish a Trojan Horse. The Trojan Horse is
simply established by having at a high level some piece of code
allowing or disallowing deletions of lower-level objects, or by
plainly writing sensitive data into lower-level system information
used when removing the object (we call it Delete Trojan Horse). The
above considerations lead to the following definition of secure
delete operation.

Definition  2  (Secure  Delete  Operation)  A  delete 
operation  is  secure  if  and  only  if  it  cannot  be  exploited 
to  establish  a  delete  covert  channel  or  a  delete  Trojan 
Horse. 

^ When we speak of object allocation rather than memory allocation,
we mean information about whether or not memory is allocated to a
given object, regardless of the amount of memory it uses (e.g. the
result of a query looking for the instances of a given class).

^ Accesses to system resources, such as memory pages and disk
sectors, that are no longer allocated. See also object reuse in [6].


An important question is whether there could be other circumstances,
besides the ones considered in Definition 2, leading to insecure
delete operations. We believe not. Indeed, illegal information flow
may arise in two cases: 1) write-down operations, which for delete
operations mean that a high-level subject may cause or prevent the
deletion of a low-level object, or write sensitive data into
lower-level system information. This case has been identified as
Delete Trojan Horse in the above definition; 2) read-up operations,
which for delete operations mean that dangling references* may
arise, or that some low-level subjects may read high-level
information about memory occupancy or de-allocated areas. All these
situations have been identified as delete covert channels in
Definition 1. Note that completeness of Definitions 1 and 2 is based
on observing that a delete operation consists of two steps:
(a) logically removing the object (and then checking references to
the object); (b) physically removing the object (and thus
de-allocating the storage allocated to the object). Our definitions
are based on analysis of the security threats that can arise in the
above steps.

Even though the delete operation can be thought of as a form of
write, the delete operation is implemented in many different ways in
object systems. It is therefore important to establish some basic
principles that enforce secure delete operations and underlie any
possible implementation of the delete operation. These principles
are the topic of the following subsection.

4.2  Security  Principles  for  Object  Deletion

In the following we define a set of principles, referred to as
security principles, ensuring the security of a delete operation.
These principles state what needs to be done by the Trusted
Computing Base (TCB) [6] to prevent illegal flows of information due
to object deletion, rather than how it will actually be implemented.
For instance, Figure 2 only shows how to exploit dangling references
to establish a delete covert channel, regardless of implementation
details. Indeed, the TCB can easily block these illegal flows by
simply using a strategy like the one discussed in Subsection 2.3 for
handling messages sent to high-level objects. According to such a
strategy, a message sent from a low-level object to a high-level
object always returns nil, independently of the actual execution of
the method invoked by the message.

We do not make any assumption on whether deletion is implicit or
explicit or whether referential integrity is enforced. In order to
make our approach widely applicable, we do not assume a particular
mandatory security model and start from the basic mandatory access
control principles introduced in Subsection 2.1. Moreover, our
principles do not assume any system architecture (single-subject vs
kernelized).

* This is because the delete operation just removes objects.

Since delete operations can be regarded as a form of write
operations, deleting a low-level object can be interpreted as a
violation of the *-property. Hence, we suggest the following
principle:

Principle 1 (No Delete Down) An object o can cause the deletion of
an object o' if and only if ℒ(o) ≤ ℒ(o').

Note that we have used ‘can cause the deletion’ rather than ‘can
delete’, because in a garbage collection environment an object can
only cause the deletion of another object by updating all references
to the given object, causing its deletion by the collector. In
systems supporting explicit deletions, an object can cause the
deletion of another object by issuing a delete command.

As we have seen in Example 1, an object o at level L could be
referenced by several high-level objects, and these references from
high-level objects to low-level objects could be used to establish a
delete covert channel. In order to prevent this type of problem, the
following principle is established:

Principle 2 (No Interference from High to Low) If an object o is
referenced by high-level objects and by no object of level
L' < ℒ(o), a delete operation invoked on the object o from level
ℒ(o) cannot be prevented.

In Example 2, dangling references from low-level objects to
high-level objects are used to infer higher-level data. However,
OIDs referencing high-level objects are needed if write-up is
allowed by the security model. Therefore, we propose the following
principle:

Principle 3 (No Dangling References from Low to High because of
High-Level Deletions) A delete operation invoked from level L on an
object o' of level L', L ≤ L', must not be allowed if there exists
at least one object o such that ℒ(o) < L and o references o'.

Note that Principle 3 does not forbid the deletion of an object o,
referenced by low-level objects, if this deletion is required by an
object at a level dominated by all the levels of the objects
referencing o. Indeed, the dangling references arising from this
deletion cannot be exploited as a covert channel by trying to access
the deleted object.


As stated by Definition 1, dangling references are not the only
means of establishing a delete covert channel. For instance, using
an untrusted garbage collector that acts on the entire database, an
object at level L could infer the deletion of an object at level L',
L < L', by monitoring the system resources. Hence, system resources
must be controlled according to the following principle:

Principle 4 (No Global Information) Information about system
resources at security level L can be made available to an object o
if and only if L ≤ ℒ(o).

It is important to note that the no read-up principle is normally
intended as a restriction imposed on the operations acting on the
database, whereas Principle 4 states a more general rule that also
avoids leakage of information through system information, such as
memory allocation.


Figure  3:  An  object  referenced  from  multiple  levels 


Example 3 The interplay among the given principles is illustrated
with the help of Figure 3. Here, an object o2 is referenced by a
high-level object o3 and a low-level object o1. Suppose now that a
subject at level L2 invokes the deletion of object o2. Under the
security principles, object o2 is not deleted, since its deletion
would violate Principle 3. Note that, even if the deletion of object
o2 is not allowed, Principle 2 is satisfied too, because object o2
is referenced by both a high-level object (i.e., object o3) and a
low-level object (i.e., object o1). Moreover, if a subject at level
L1 invokes a delete operation on object o2, this operation is
allowed, since the deletion of object o2 does not violate any
principle. In particular, Principle 3 is satisfied because the
deletion is invoked from level L1. Indeed, the dangling references
caused by this deletion, that is, the references from o3 to o2 and
from o1 to o2, cannot be exploited as a delete covert channel.
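
The decision taken in Example 3 can be reproduced with a small sketch of a delete check. This is an illustration under stated assumptions, not the paper's implementation: security levels are modelled as integers (a totally ordered lattice), and the names `Obj` and `may_delete` are hypothetical. Principle 2 is respected by construction, because references held by objects at or above the requesting level are never consulted; Principle 4 concerns system information and is outside the scope of this check.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    level: int                                # integer levels: an assumption
    refs: set = field(default_factory=set)    # names of referenced objects


def may_delete(db, requester_level, target_name):
    """Return (allowed, reason) for a delete request under Principles 1-3."""
    target = db[target_name]

    # Principle 1 (No Delete Down): the requester must not dominate
    # the target strictly.
    if requester_level > target.level:
        return False, "Principle 1"

    # Principle 3: refuse if an object strictly below the requesting
    # level references the target (an upward dangling reference would arise).
    for obj in db.values():
        if target_name in obj.refs and obj.level < requester_level:
            return False, "Principle 3"

    # References from higher levels are ignored on purpose (Principle 2).
    return True, "allowed"
```

With o1 at level 1 and o3 at level 3 both referencing o2 at level 2, `may_delete(db, 2, "o2")` is refused because of Principle 3, while `may_delete(db, 1, "o2")` is allowed, as in the example.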

The  correctness  of  the  above  security  principles  is 
stated  by  the  following  proposition. 


Proposition 1 A delete operation is secure iff Principles 1-4 are
satisfied.

Proof. We first prove the only-if part of the thesis, namely that a
secure delete operation satisfies Principles 1-4. We suppose that
the implication does not hold and derive a contradiction. Suppose
that the delete operation is secure, and that one among Principles
1-4 is not satisfied.

•  It is trivial to prove that if Principle 1 is not satisfied the
delete operation is not secure, since a delete Trojan Horse could be
established.

•  If Principle 2 is not satisfied, the delete operation is not
secure, since a delete covert channel, like the one in Example 1,
could be established.

•  If Principle 3 is not satisfied, the delete operation is not
secure, since a delete covert channel, like the one illustrated in
Example 2, could be established.

•  If Principle 4 is not satisfied, a delete covert channel could be
established by monitoring the system resources at a higher security
level, and hence inferring the deletion of high-level objects.

Thus,  in  all  the  above  cases,  a  contradiction  arises. 

We now consider the other part of the implication. We suppose that
the implication does not hold and derive a contradiction. Suppose
that Principles 1-4 are satisfied and the delete operation is not
secure. According to Definition 2, this is equivalent to supposing
that Principles 1-4 are satisfied and the delete operation can be
exploited to establish either a delete covert channel or a delete
Trojan Horse.

•  Suppose that the delete operation can be exploited to establish a
delete Trojan Horse. It is easy to show that this implies that the
delete operation does not satisfy Principle 1, because Principle 1
is a specialization of the *-property to the delete operation
context. Thus, a contradiction arises.

•  Suppose that the delete operation can be exploited to establish a
delete covert channel. According to Definition 1, a delete covert
channel can be established only by (i) exploiting dangling
references, (ii) monitoring the state of the system with respect to
object or memory allocation, or (iii) performing intentional data
scavenging. We consider each of the above cases separately.


-  Suppose that the delete operation can be exploited to establish a
delete covert channel by means of dangling references arising from
the delete operation. Downwards dangling references, as well as
dangling references within the same level, cannot be exploited to
establish a covert channel, since they cannot be used to transfer
information from a security level to lower security levels. Hence,
the delete operation can be exploited to establish a covert channel
only if it results in upwards dangling references. Thus, suppose
that the delete operation results in a dangling reference from a
level L1 to a level L2, L1 < L2. This means that the delete
operation removes an object from level L2. A delete covert channel
can be established only if the deletion is required from a level L'
strictly dominating level L1, because only in this case can the
dangling reference be used to transfer higher-level information,
i.e. information at level L', to a subject at level L1. If the
delete operation is invoked from a level strictly dominating level
L2, Principle 1 is not satisfied. If the deletion is required from a
level dominated by L2, a violation of Principle 3 arises. Hence, in
both cases, a contradiction arises.

-  Suppose that the delete operation can be exploited to establish a
delete covert channel by monitoring the state of the system with
respect to object or memory allocation or by performing intentional
data scavenging. In this case, a delete covert channel can be
established only if the above operations can be used to infer
higher-level information. This is possible only in two cases: 1) a
subject at a given level performs the above operations at security
levels strictly dominating its security level. However, in this
case, Principle 4 is not satisfied, which contradicts the
assumption; 2) the execution of the above operations at a given
security level allows higher-level information to be inferred, that
is, the result of these operations at a given security level is
determined by operations at higher levels. This is possible only if
the presence or absence of an object o at a given level is
conditioned by high-level objects, that is, object o is referenced
by a high-level object. In this case, if o is not referenced by
low-level objects and if a garbage collection approach is used, a
subject at a level strictly dominated by the level of the object
referencing o could infer the existence of an object at a higher
level by invoking the deletion of o. If o is not deleted, the
existence of a high-level object referencing o is inferred. Thus,
consider two objects o and o' such that ℒ(o) < ℒ(o'), o' references
o, and o is not referenced by low-level objects. Information from
level ℒ(o') to lower security levels can be transferred only if the
deletion of object o is invoked from a level L strictly dominated by
ℒ(o'), because only in this case can the deletion or non-deletion of
object o be used at level L to infer higher-level information, that
is, information at level ℒ(o'). If the level from which the deletion
is invoked strictly dominates level ℒ(o), Principle 1 is not
satisfied, which contradicts the assumption. If the deletion is
invoked from a level L strictly dominated by ℒ(o), then a delete
covert channel can be established only if information on the
allocation of object o is accessible at level L. However, in this
case Principle 4 is not satisfied, which contradicts the assumption.
Finally, if the deletion is required from level ℒ(o), information
from level ℒ(o') can be inferred only if the deletion is not
allowed. In this case, Principle 2 is not satisfied, which
contradicts the assumption.

4.3  Implementation  Issues 

As stated in the above discussion, the four principles guarantee the
security of the delete operation. The description of a detailed
implementation of the delete operation is outside the scope of this
paper. Nevertheless, it is possible to make some
implementation-independent considerations that help in securely
designing and implementing such an operation.

•  The delete operation can be considered a special case of write
operation. Therefore, a TCB enforcing the *-property satisfies
Principle 1 (for more details about garbage collection see
Section 5).

•  The delete operation must neither update the state of an object
nor read it, but it must physically remove an object. An important
requirement for a system to be secure is that the basic storage
elements (e.g. disk sectors, memory pages, etc.) be cleared prior to
their assignment to an object, so that no intentional or
unintentional data scavenging takes place. The storage elements can
be cleared when deallocated, that is, when an object is deleted. The
security principles are defined disregarding implementation details.
Hence we require the physical deletion to be performed by the TCB:
when a delete operation is invoked on an object, the TCB calls a
trusted procedure to perform the deletion.

•  There are two possible approaches to enforce Principle 3:

1.  Upwards dangling references are masked by concatenating the OIDs
with the security level where the object is allocated [18] and
making such dangling references ineffective [4]. That is, an object
trying to access a high-level object is returned a default reply
value even if the target object has been previously deleted. This
can be performed by the TCB, which determines the security level
from the OID and can, therefore, recognize a high-level OID. This
approach requires that the OID contains the security level of the
object (cf. Section 3).

2.  The deletion of an object like o' in Figure 2 is prevented by
the TCB.^

In both cases Principle 3 is satisfied. In particular, the first
solution makes the upwards dangling references ineffective. Hence
the principle is satisfied simply because no low-to-high dangling
reference can compromise security. Note that this strategy can also
be applied to incomparable security levels. Moreover, according to
the first of the above approaches, object o2 in Figure 3 can be
deleted by a subject at level L2 and Principle 3 is satisfied,
because the dangling reference from object o1 to object o2, arising
from the deletion of o2, is masked by the TCB. The first approach
can also be adopted for the create operation. An object o requesting
the creation immediately receives the new OID and the creation
itself is executed asynchronously: errors that may occur during the
creation do not prevent object o from immediately receiving the new
OID.

^ Auxiliary information about low-level OIDs should be stored by the
TCB for this purpose.

Figure 4: A suspicious delete covert channel

Note that, if the second approach is chosen, the situation shown in
Figure 4 seems to generate a security violation. Suppose that object
o2 in Figure 4(a) is being deleted by a subject at level L2. If a
subject at level L1 removes the OID referencing o2 and the OID in
object o3 is not updated (Figure 4(b)), the deletion is executed
(Principle 2), whereas if a subject at level L3 removes the OID
referencing o2 and the OID in object o1 is not updated
(Figure 4(c)), the deletion is denied (Principle 3). Hence, a
subject at level L2 can infer a bit of information from the result
of the deletion (denied or executed). In particular, even if the
reply value of the delete operation is the same in both cases, the
subject at level L2 can infer the deletion of object o2 by
monitoring the memory allocation at level L2. Nevertheless, given a
secure system, the example stated above is not a delete covert
channel. Indeed:

1.  In the example stated above, three subjects must cooperate. The
two subjects at levels L3 and L1 must know not only how the covert
channel can be established, but also all the data to be transmitted.

2.  The subject at level L1 cannot know the data to be transmitted
to the subject at level L2. Such data cannot be legally transmitted
from level L3 to level L1 via the system resources, since the no
write-down and no read-up principles are enforced by the TCB.
Moreover, a security violation cannot occur, since the system is
secure and, in particular, the security principles are enforced.
Hence, if the data could be transmitted from level L3 to level L1,
the system would not be secure, contradicting the hypothesis.

3.  The subject at level L3 is forced to transmit to the subject at
level L1 what the subject wants to illegally transmit to the subject
at level L2. Hence, if the example shown in Figure 4 were a delete
covert channel, then each information flow would be (potentially)
illegal.
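
The first approach to enforcing Principle 3 discussed above, in which the OID carries the security level and upward messages always receive a default reply, can be sketched as follows. The class `MaskingTCB`, its method names, the tuple encoding of OIDs and the use of integer levels are assumptions made for illustration; write-down handling is reduced to a plain refusal and asynchronous creation is not modelled.

```python
NIL = None  # default reply value returned for every upward message


class MaskingTCB:
    """Toy TCB that masks upward dangling references (hypothetical sketch)."""

    def __init__(self):
        self._objects = {}   # oid -> attribute dictionary
        self._serial = 0

    def create(self, level):
        oid = (level, self._serial)   # the OID embeds the security level
        self._serial += 1
        self._objects[oid] = {}
        return oid

    def delete(self, oid):
        self._objects.pop(oid, None)

    def send_write(self, sender_level, oid, key, value):
        target_level = oid[0]         # level is read from the OID itself
        if sender_level > target_level:
            raise PermissionError("no write-down")
        obj = self._objects.get(oid)
        if sender_level < target_level:
            # Write-up: update the target if it still exists, but reply
            # NIL in every case so that a deletion cannot be observed.
            if obj is not None:
                obj[key] = value
            return NIL
        # Same level: a dangling reference may be reported normally.
        if obj is None:
            raise KeyError("dangling reference")
        obj[key] = value
        return "ok"
```

Because the reply to a low-level sender is identical before and after the high-level object is deleted, the probe used in Example 2 no longer yields any information.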

A TCB satisfying the requirements stated above can be designed on
the basis of the message filter approach described in Subsection
2.3. Finally, note that referential integrity is not preserved by
the security principles, because they deal only with security. In
particular, because of Principle 2, dangling references can arise
which, however, cannot be exploited as delete covert channels.
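
The requirement stated earlier in this subsection, that storage elements be cleared when an object is physically deleted, can be illustrated with a minimal sketch. The `TrustedStorage` class, its page size and its free-list policy are hypothetical and stand in for whatever trusted procedure the TCB actually calls.

```python
class TrustedStorage:
    """Toy page store whose delete procedure clears pages (hypothetical)."""

    def __init__(self, n_pages, page_size=16):
        self.pages = [bytearray(page_size) for _ in range(n_pages)]
        self.free = list(range(n_pages))
        self.owner = {}               # oid -> index of the page it occupies

    def allocate(self, oid, data):
        if len(data) > len(self.pages[0]):
            raise ValueError("data does not fit in one page")
        index = self.free.pop()
        self.pages[index][:len(data)] = data
        self.owner[oid] = index
        return index

    def trusted_delete(self, oid):
        index = self.owner.pop(oid)
        page = self.pages[index]
        page[:] = bytes(len(page))    # overwrite with zeros before reuse
        self.free.append(index)
```

Clearing at deallocation time means that a later `allocate` call can never expose residual data from the previous owner, which rules out data scavenging on reused pages.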

5  Secure  Garbage  Collection 

Our aim is to achieve referential integrity in multilevel databases
by means of garbage collection. Garbage collection deals with
reclaiming storage that was once used but is no longer needed. The
collector is invoked periodically or when a memory overflow arises.
A serious drawback of conventional garbage collection mechanisms is
that the garbage collector would have to access objects at various
security levels. The garbage collector would therefore have to be
trusted. We describe here a different approach, based on the
mark-and-sweep technique; under this approach the collector is
structured so that the trusted part is minimized. We analyze this
approach with respect to the four principles for secure delete
operations and we show that the garbage collector satisfies the four
principles. It therefore implements a secure delete operation,
because of the results stated by Proposition 1.

A mark-and-sweep collector follows pointers in the heap, marking any
object that is reached (marking phase); then it collects all the
non-marked objects (sweeping phase), scanning the heap sequentially.
The marking phase starts from the root objects (objects containing
information always needed). The root objects are the “entry points”
for a security level.
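
For reference, a conventional single-level mark-and-sweep collector can be sketched in a few lines. The representation of the heap as a dictionary from OIDs to lists of referenced OIDs, and the function name `mark_and_sweep`, are assumptions made for illustration; no security levels are involved yet.

```python
def mark_and_sweep(heap, roots):
    """Collect unreachable objects; return the set of collected OIDs.

    heap  -- dict mapping each OID to the list of OIDs it references
    roots -- iterable of OIDs from which marking starts
    """
    # Marking phase: follow references starting from the root objects.
    marked = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in marked or oid not in heap:
            continue
        marked.add(oid)
        stack.extend(heap[oid])

    # Sweeping phase: scan the heap and remove every non-marked object.
    garbage = [oid for oid in heap if oid not in marked]
    for oid in garbage:
        del heap[oid]
    return set(garbage)
```

In the multilevel setting described next, the marking loop is what each untrusted marking collector performs for its own level, while the sweeping loop is the part reserved to the TCB.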

One way to implement a multilevel trusted collector is to employ a
TCB [22] which controls the behavior of single-level untrusted
collectors and enforces the Bell-LaPadula principles. Therefore, we
require root objects and a marking collector MC_L for each security
level L. The marking collector MC_L is an object at level L in the
database, which is activated and controlled by the TCB. The marking
collector MC_L executes the marking phase for level L, while all
sweeping phases are performed by the TCB to avoid data scavenging.
MC_L does not mark an object o at level L or higher if and only if
all references to o from other objects at level L have been removed
(that is, the object is non-locally reachable). Garbage collection
is managed according to the stop-the-world approach: activities are
suspended, garbage is collected and then activities are restarted.

Since marking collectors are untrusted objects, each marking
collector can only read objects or system information at its
security level or at lower levels. In the following we show how to
prevent the marking collectors from being exploited as storage
covert channels. Storage covert channels are illegal channels
established via the exploitation of the dynamic allocation of memory
or via data scavenging. For example, a high-level subject could
establish such a covert channel by saturating the memory, to prevent
the normal computation of a low-level subject, which in turn could
infer high-level information. To overcome this drawback, we adopt
the following solution. System memory (volatile and non-volatile) is
divided into a number of partitions of fixed size, one for each
security level. Subjects at level L can allocate memory only from
the partition assigned to L, and the creation of a high-level object
is performed at the level requested for the new object. This
allocation scheme prevents storage covert channels from being
established.
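
The partitioning scheme can be illustrated with a small accounting sketch. The class `PartitionedMemory` and its interface are hypothetical, and memory is modelled as a simple counter per level rather than as real storage.

```python
class PartitionedMemory:
    """Fixed-size memory partitions, one per security level (hypothetical)."""

    def __init__(self, levels, partition_size):
        self.partition_size = partition_size
        self.used = {level: 0 for level in levels}

    def allocate(self, level, amount):
        # Only the partition assigned to `level` is consulted, so the
        # outcome never depends on allocations made at other levels.
        if self.used[level] + amount > self.partition_size:
            raise MemoryError(f"partition for level {level} is full")
        self.used[level] += amount

    def release(self, level, amount):
        self.used[level] = max(0, self.used[level] - amount)
```

A high-level subject that saturates its own partition has no effect on whether a low-level allocation succeeds, which is the property the scheme relies on.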

The marking collector MC_L executes a write operation in order to
mark an object, hence it is only able to mark objects at level L or
higher. The marking collector MC_L cannot be aware of references
from objects at security levels higher than L because of the no
read-up restriction. Therefore, dangling references could arise at
security levels higher than L after the garbage collection is
completed. The approach we propose to avoid dangling references is
based on copying operations. Under this approach, an object o,
non-locally reachable at its security level, is copied at higher
security levels as needed. This mechanism does not need to be
trusted; therefore it can be implemented by the marking collectors.
The marking collector MC_L^ builds a table called Copy Table to
store pairs of related OIDs of the form (old-oid, new-oid), where
old-oid is the OID of a low-level non-marked object while new-oid is
the OID of its copy created at level L by the marking collector
MC_L. The Copy Table at level L is read by the marking collectors at
levels higher than L to avoid redundant copies. It is sufficient to
create a copy of a non-locally reachable object o at the lowest
levels where o is needed and update all high-level objects
referencing o. When dealing with incomparable levels, a copy is
generated for each level as needed.

^ Except for the lowest level in the security lattice.

The marking collectors are activated by visiting the security
lattice on the basis of a sequence (L1, ..., Ln) called
visit-sequence, where L1 is the lowest level, Ln the highest, and
for each Li, Lj, 1 ≤ i ≤ n, 1 ≤ j < i, Lj < Li or Lj <> Li. The
visit-sequence is a static list associated with a given database.
When level L is visited, the marking collector MC_L is activated and
after its termination the next security level in the visit-sequence
is visited. The sweeping phases must be postponed until the end of
the marking phases, otherwise an object could be removed before
being copied: when the last collector completes its execution, the
sweeping phase is performed for all security levels.
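
The ordering condition on the visit-sequence says that no level may be visited before a level it strictly dominates. A minimal sketch of a checker and of one way to build such a sequence follows; the function names and the `below(a, b)` predicate (true when level a is strictly dominated by level b) are assumptions, and the construction is an ordinary topological ordering rather than a procedure given in the text.

```python
def is_visit_sequence(seq, below):
    """True if every earlier level is lower than or incomparable with each later one."""
    return all(
        not below(seq[i], seq[j])      # a later level must not sit below an earlier one
        for i in range(len(seq))
        for j in range(i)
    )


def build_visit_sequence(levels, below):
    """Order the levels so that the visit-sequence condition holds."""
    remaining = list(levels)
    sequence = []
    while remaining:
        # Pick any level with no remaining level strictly below it.
        nxt = next(
            level for level in remaining
            if not any(below(other, level) for other in remaining)
        )
        sequence.append(nxt)
        remaining.remove(nxt)
    return sequence
```

For example, with levels modelled as `frozenset` objects of categories and `below` taken to be the proper-subset test, the four-level diamond lattice over categories a and b is ordered with the empty set first and `{a, b}` last, while the two incomparable middle levels may appear in either order.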

Figure 5 shows an example of this approach: object 1 at level L1 is
not marked by the marking collector MC_L1; hence it is copied at
level L2 by the marking collector MC_L2, which adds the pair
(oid(1), oid(1')) to the Copy Table at level L2. The OID stored in
object 4 at level L3 and referencing object 1 is updated with the
OID of the copy 1' generated at level L2. This update is performed
by the marking collector MC_L3 by reading the Copy Table at level
L2.
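
The copying scheme can be sketched end to end on a small database loosely modelled on the situation described for Figure 5. Everything below is an illustrative assumption: levels are integers visited in increasing order, the function `run_collection` plays the role of all marking collectors followed by the TCB sweep, copies are named by appending a prime to the old OID, and the object numbered 2 is a hypothetical level-2 object that references object 1.

```python
def run_collection(db, roots, levels):
    """Mark level by level, copy unmarked low-level objects, then sweep.

    db     -- dict: oid -> {"level": int, "refs": [oid, ...]}
    roots  -- dict: level -> list of root OIDs at that level
    levels -- visit-sequence, here integers in increasing order
    Returns the Copy Tables as {level: {old_oid: new_oid}}.
    """
    marked = set()
    copy_tables = {level: {} for level in levels}

    for position, level in enumerate(levels):
        stack = list(roots.get(level, []))
        while stack:
            oid = stack.pop()
            if oid in marked:
                continue
            marked.add(oid)                       # marking is a write at this level
            refs = db[oid]["refs"]
            for i, ref in enumerate(refs):
                target = db[ref]
                if target["level"] == level:
                    stack.append(ref)
                elif target["level"] < level and ref not in marked:
                    # Reuse a copy recorded in a Copy Table at this level or below.
                    new_oid = None
                    for lower in levels[:position + 1]:
                        new_oid = copy_tables[lower].get(ref, new_oid)
                    if new_oid is None:
                        new_oid = ref + "'"
                        db[new_oid] = {"level": level, "refs": list(target["refs"])}
                        copy_tables[level][ref] = new_oid
                    refs[i] = new_oid             # redirect the reference to the copy
                    stack.append(new_oid)
                # References to higher levels cannot be read (no read-up).

    # Sweeping phase, performed once all marking phases have ended.
    for oid in [o for o in db if o not in marked]:
        del db[oid]
    return copy_tables
```

Running it on a database where object 1 (level 1) is unreachable from the level-1 roots but referenced by object 2 (level 2) and object 4 (level 3) creates the copy 1' at level 2, records the pair in the level-2 Copy Table, redirects both references to 1', and sweeps the original object 1.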

This approach satisfies the security principles previously stated.
Suppose that the marking collectors correctly execute the marking
phase. We have that:

•  Principle 1 is satisfied. The marking collectors cannot perform
explicit deletions. Moreover, they are objects under the control of
the TCB; hence they cannot violate the *-property by causing a
low-level object to be deleted.

•  Principle 2 is satisfied. A non-marked low-level object is copied
at higher security levels, if needed, then it is deleted by the TCB.
By contrast, if an object is locally reachable, it is marked and
cannot be deleted.

•  Principle 3 is satisfied. No low-to-high dangling reference
appears if the marking collectors correctly execute the marking
phase.

•  Principle 4 is satisfied. The rule stated by this security
principle is enforced by the TCB; hence we can assume that the
principle is satisfied.

Even if a marking collector incorrectly executes the
marking phase, no security violation arises. The marking
collector MC_L cannot read information at higher
or incomparable levels; hence an incorrect marking
phase can only generate dangling references from objects
at level L. The only dangling references that
could be exploited to establish a delete covert channel
are those referencing high-level objects that have been
deleted during the sweeping phases; these dangling references
can be made ineffective by the TCB (cf. Subsection
4.3, solution 1). Moreover, the system memory
is divided into partitions that all have the same fixed
size. Therefore, it is not possible to establish
covert channels due to memory saturation, overflows
and so on. Finally, timing channels due to the execution
of marking collectors can be avoided by properly
controlling their execution time. For instance, the execution
time of each marking collector can be forced to
be longer than a pre-defined lower bound, which can
be the same for each security level. Since the upper
bound for the bandwidth of timing channels is 100 bits
per second (bps) [9], the lower bound can be assigned
a value that does not cause a performance penalty.

Figure 5: Achieving referential integrity with garbage collection

It can be shown by similar arguments that the
copying garbage collection mechanism proposed in [4]
implements a secure delete operation. Moreover, it ensures
referential integrity, like the mark-and-sweep technique
described here, by copying at higher levels objects
which are no longer reachable from lower levels.

6  Concluding  Remarks 

We have analyzed issues related to object deletion
in multilevel secure ODBMSs and we have stated
four security principles ensuring a secure delete operation.
These four principles are rules that should
be observed in designing and implementing garbage
collection mechanisms and mechanisms for data manipulation
in object systems. The security principles
do not assume a particular security architecture, nor
do they define rules for a specific implementation. We
have shown how a multilevel garbage collector algorithm,
based on the mark-and-sweep technique, can
be analyzed with respect to the above principles.

We believe that much work is still needed in this
area. An important question concerns the computational
overhead associated with each principle. Because
the principles are abstract and are independent
of any specific implementation, it is difficult to assess
the overhead. In general, we can say that the overhead
is proportional to the number of references that each
object has and how such references are distributed
across levels. Such considerations are confirmed by
some experimental evaluations performed on an implementation
of the copying garbage collector. The
experiments have shown that the performance heavily
depends on the number of copying operations of objects
from lower levels to higher levels. The number of
copying operations in turn depends on the number of
references from higher-level objects to lower-level objects.
Another factor impacting the performance is the
structure of the lattice of security levels. If the number
of incomparable levels is high, the performance is
good, because the first phase of the collector can be
activated in parallel for all the incomparable levels.
By contrast, the performance is not optimal when all
levels are totally ordered. It is easy to see that these
experimental considerations apply also to the
mark-and-sweep garbage collector presented in this paper.

Another important topic we plan to investigate is
the analysis of the rules stated by the security principles
in the framework of relational [20] and deductive
database systems, and their extension to cope also
with object creation. Finally we plan to investigate
the definition of formal strategies to state covert channels
and their implications on implementation.

Abstract

Methods are an important characteristic of Object-oriented databases. Previous
models for Discretionary access-control in OO databases have considered policies for
Methods and Inheritance. However, discretionary authorization models do not provide
the high assurance required in systems where Information flow is considered a problem.
Mandatory models can solve the problem but usually they are too rigid for commercial
applications. Therefore discretionary information-flow control models are needed,
especially when transactions containing general method invocations are considered.

This  paper  first  reviews  existing  security  models  for  object-oriented  databases  with 
and  without  information-flow  control.  These  models  rely  on  the  run-time  checks  of 
every  message  transferred  in  the  system.  This  paper  uses  a  compile-time  approach 
and  presents  algorithms  for  flow  control  which  are  applied  at  Rule-administration  and 
Compile  times,  thus  saving  considerable  run-time  overhead.  Special  emphasis  is  put  on 
the  Flow-analysis  of  Methods  and  the  transactions  invoking  them. 

Keywords:  Information  Flow,  Discretionary,  Object-oriented,  Methods 


1  Introduction 

Security  is  an  important  topic  for  Databases  in  general  and  for  Object-oriented  databases 
(OODB)  in  particular  [Kim90,  Kemp94].  In  general,  authorization  mechanisms  provided  by 
commercial  DBMS  are  discretionary,  that  is,  the  grant  of  authorizations  on  an  object  to 
other  subjects  is  at  the  discretion  of  the  object  administrator. 

The main drawback of discretionary access control is that it does not provide a real
assurance on the satisfaction of the protection requirements, since discretionary policies do
not impose any restriction on the usage of information by a subject who has obtained it
legally. For example, a subject who is able to read data can pass it to other subjects not
authorized to read it. This weakness makes discretionary policies vulnerable to attacks from
"Trojan horses" embedded in programs.* Access control in mandatory protection systems
is based on the "no read-up" and "no write-down" principles [Cast95]. Satisfaction of these
principles prevents information stored in high-level objects from flowing to lower-level objects.
The main drawback of mandatory policies is their rigidity, which makes them unsuitable for
many commercial environments.

There is a need for an access control mechanism able to provide the flexibility of discretionary
access control and, at the same time, the high assurance of mandatory access control.

*The term "Trojan horse" is used here to refer to any illegal leakage of information, not necessarily a
destructive one...

A first attempt to do so in the context of OODBs was made by Samarati et al. [Samar97].
The main problem with the model in [Samar97] is that all the checks are done at Run-time,
which considerably increases the overhead in the system. Many DBMSs rely on protection
which is checked at compile time, for example Query modification in Ingres [Stone76] or
View-based mechanisms in System R [Griff76] in Relational systems, or the model suggested
by Fernandez et al. [Fern94] for Object-oriented databases. In this paper we investigate the
problem of ensuring safe information flow for Object-Oriented databases by performing the
checks at Compile time or at Rule-definition time, thus saving considerable overhead at run-time.
A very important assumption of the present paper is that the run-time of the DBMS
can be trusted. That is, if one composes one's transactions only from Queries and Methods
which were compiled under the control of the DBMS, one can trust their object-code and
the Run-time system which executes them. A similar assumption is made in View-based
systems where views are kept after they are compiled and optimized [Griff76].

In a recent paper by the same authors [GenGud97] we presented a simple Transactions
model and algorithms to check for information-flow at compile-time for this transaction
model. The limitation of the transaction model in [GenGud97] was that only the basic
READ/WRITE methods were allowed, and no general methods. In this paper we extend
the previous transactions model by allowing transactions to invoke any method (with or
without parameters) and these methods may further invoke other methods. We put very
few restrictions on the type of programming language and constructs used within methods.
Using program-flow analysis techniques [Muchnick81], we are able to analyze the methods at
compile-time when they are entered into the system, and complete the analysis at the time
the transaction is compiled. Therefore, this process is very efficient since the compile-time
analysis for methods is done only once (provided the method was not changed).

It is important to note that our algorithms provide an upper bound for the
information-flow problem. Since program-flow analysis methods cannot know the actual Run-time
control-flow, they must consider all branches of an IF, WHILE or CASE statement. Thus,
when our algorithm reports a safe information flow the safeness is assured, but when it
reports an unsafe information flow, it actually reports only a potentially unsafe information
flow. If one is very concerned about "false" alarms, one can employ a Run-time method
in these cases, such as the one in [Samar97]. Another problem in [Samar97] is that within
a Method or Transaction all the information associated with a Read query is added into
the overall run-time flow, regardless of whether there is an actual flow (between program
statements) of this information into the objects which the Method writes. This problem is
overcome in our model, because flows within program statements are analyzed within every
method.

As this paper relies heavily on three previous papers, [Samar97], [Fern94], and [GenGud97],
these papers are first reviewed briefly in Section 2 and the definition of safe information flow
is given. The generalized Transactions and Methods model is defined in Section 3 and the
overall approach is explained. In Section 4 we present our compile-time analysis of Methods,
and in Section 5 we present the Transactions analysis algorithm. Examples are given in both
sections. Section 6 is the Summary.

2  Background 

2.1 Fernandez et al.

The first model [Fern94] assumes a simple discretionary Rules-based authorization. The
model deals mainly with the impact of inheritance on security and enforces several
inheritance-based policies. In following papers, policies were proposed for negative authorization,
content-dependent restrictions, and for resolving conflicts between several implied authorizations (see
[Larr90]). Another paper extended the basic model to include treatment of general methods
[GalOz93]. (As an example of the inheritance policies, consider the database in Figure 1.
A rule giving a user Read access to all attributes of Student also implies Read access to a
Foreign Student's Social Security number (SSN), but not to his/her Visa...)

Because of space limitations we will not review this paper here. The most relevant point
is the discussion of an Access Validation algorithm. The validation algorithm is applied at
Compile-time in that it works after the Query translator and its output is entered to the
Optimizer and run-time system. The Access-validation algorithm accepts two major inputs:

• The original query after translation, in the form of a tree. This query is further extended
using the inheritance hierarchy to something called the Authorization Tree (AT_yes). (The
AT_yes will be redefined in the next section, therefore we do not detail its structure
here.) Initially, all the nodes of the AT_yes are authorized. After the validation algorithm,
the AT_yes contains only the nodes and the attributes to which access is allowed (see
an example in Section 3.1).

• The rules which are relevant to this query are extracted from a tree called the Security
Graph, which is an extension of the AT_yes upwards and downwards to include all
relevant rules.

The algorithm scans in parallel the query nodes and security graph nodes, applies the
inheritance policies mentioned above and produces the final AT_yes which defines the allowed
access.

2.2 Samarati et al.

The  second  model  by  Samarati  et  al.  [Samar97]  describes  a  run-time  architecture  (Message 
filter)  for  checking  for  information  flow.  Again,  for  reasons  of  space  we  cannot  describe  the 
model  in  detail.  The  most  important  concepts  are: 

• Transaction. A transaction is the set of method invocations caused by a user sending
a message. The first message invokes a method which invokes other methods by sending
messages to them and waiting for replies. The invoking method may in turn wait for the
reply (synchronized) or defer its waiting. A user executing a transaction is called
the Transaction initiator.

• Access lists. There are several access lists associated with each object including
RACL(o) - the list of users who can read from object o,
WACL(o) - the list of users who can write into object o.

• Information flow. There exists a flow between O_i and O_j in a transaction if and
only if a write or create method is executed on O_j, and that method had received
information (via forward or backward transmission) on O_i. When a method A sends a
message to another method B, then all the information which flowed into A is assumed
to flow into B. Similarly, if a method A receives a reply from B, the information that
flowed into B is assumed to flow into A.

• Safe Information flow. An information flow from O_i to O_j is safe only if all users
who can read O_j can also read O_i, i.e. RACL(O_j) is contained in RACL(O_i).

To enforce only safe information flows, [Samar97] suggests the construction of a Message
Filter component which intercepts each and every message in the system. Using
this information it is possible to enforce safe information flow and disallow transferring
of information which may cause an unsafe flow (i.e. an empty reply is returned in
that case...)
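
The safety condition on access lists reduces to a subset test. The sketch below is an illustration only, not the Message Filter of [Samar97]: the dictionary `racl`, the object names and the user names are hypothetical.

```python
def is_safe_flow(racl, source, target):
    """A flow source -> target is safe if every user who can read the
    target can also read the source, i.e. RACL(target) is a subset of
    RACL(source)."""
    return racl[target] <= racl[source]


# Hypothetical read access lists for two objects.
racl = {
    "report": {"alice", "bob"},
    "salary": {"alice"},
}
```

Here a flow from `report` into `salary` is safe, because the only reader of `salary` may also read `report`; the reverse flow is unsafe, because `bob` could then learn something derived from `salary`.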

Although the above algorithm is very general and works for various types of methods
and executions, it requires checking and filtering every message in the system. This is a
considerable overhead! In the next sub-section, we discuss a simpler model, of Read/Write
methods only, but one on which a compile-time algorithm based on [Fern94] is used.

2.3  Gendler  &  Gudes  Model 

The model by Gendler & Gudes [GenGud97] provides compile-time checking for
information-flow which is based on the AT_yes idea of [Fern94]. The following concepts are used:

• Object Model and Authorization Model. Both models are similar to the ones in
[Fern94].

• Access Lists. In [Fern94] the main administration structure was the authorization
rule placed at a special class/node in the object-hierarchy tree. For purposes of flow
control we need to define also for each attribute a list of all users authorized to read
it. We maintain the structure called read access list (RACL) containing the list of users
who have read privileges to the attribute. The RACL can of course be obtained using
the inheritance policies mentioned above:

$$\mathrm{RACL}(O.Attr) = \{\, u : (\exists\, O' \mid O \preceq O' \text{ and } \exists \text{ rule } (u, R, O'.Attr)) \;\wedge\; (\nexists\, O'' \mid O \preceq O'' \prec O' \text{ and } \exists \text{ rule } (u, \neg R, O''.Attr)) \,\}$$

i.e. this list contains the users to whom the authorization rules permit read access
to the attribute, either explicitly or via the inheritance policies specified above.

• Authorization Tree. Each query of the type above is validated against the initiator's
(u) authorization rules using the model and algorithm presented in [Fern94]. The
result of such validation is the set of objects (classes) and their attributes which is
authorized for this query. Basically, this set is a sub-tree of the query graph rooted at
O.Attr and is called authorization tree, denoted AT_yes(u, A, O.Attr). In the sequel
we will usually use the authorization trees for Read access, and therefore denote them
as: AT_yes(u, O.Attr).

• User Access Tree (UAT). The set of attributes in the "entire database" that user
u is allowed to access for reading is called user access tree.

$$\mathrm{UAT}(u) = \{\, O.Attr : u \in \mathrm{RACL}(O.Attr) \,\}$$

The above UAT is computed from the data structure RACL, but, obviously, it is also
true that

$$\mathrm{UAT}(u) = \bigcup_{i,j} \mathrm{AT\_yes}(u, O_i.Attr_j)$$

• Common User Access Tree (CUAT). We introduce a new measure for every attribute
A_j: the intersection of the UATs of all users who are permitted to read it. This
intersection expresses the set of all attributes which is allowed to be read by all users
who are allowed to read attribute A_j.

$$\mathrm{CUAT}(O.Attr) = \bigcap_{\forall u \in \mathrm{RACL}(O.Attr)} \mathrm{UAT}(u)$$

As will be shown below, the CUAT is a critical component in computing safe information
flow. Detailed algorithms for its computation were given in [GenGud97].

Figure 1: University database

• Safe Information Flow. We are now ready to define the criteria for safe information
flow. Intuitively, we know that every read query after validation can only read the
objects and attributes contained in that query's authorization tree. Therefore, the
union of these trees expresses all the information to which this transaction has read
access. We must make sure that the users who have access to the object into which
this transaction writes are allowed to access all the information that has flowed into
the transaction up to the Write query.

Theorem 1 (Safe Information Flow) The information flow to the attribute O_k.Attr_j
caused by the write access write(O_k.Attr_j, v) in a transaction is safe if and only if the
Common User Access Tree of the attribute O_k.Attr_j contains the union of the authorization
trees of all the previous read queries.

$$\bigcup_{t=1}^{n} \mathrm{AT\_yes}(t) \subseteq \mathrm{CUAT}(O_k.Attr_j) \iff \text{the information flow to } O_k.Attr_j \text{ is safe,}$$

where t = 1, ..., n indexes the read queries that precede the write.

Proof: see [GenGud97]. Intuitively, each object read by the transaction and potentially
written into O_k.Attr_j must be contained in the set of objects allowed Read access by all
users who can read object O_k.Attr_j.
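
The definitions of UAT and CUAT and the test of Theorem 1 can be illustrated with plain set operations. This is a minimal sketch under simplifying assumptions: authorization trees are flattened into sets of attribute names, the RACL is given directly as a dictionary rather than derived from rules and inheritance, and all attribute and user names are hypothetical.

```python
def uat(racl, user):
    """UAT(u): all attributes whose RACL contains the user."""
    return {attr for attr, readers in racl.items() if user in readers}


def cuat(racl, attr):
    """CUAT(O.Attr): intersection of UAT(u) over all readers u of attr."""
    trees = [uat(racl, u) for u in racl[attr]]
    # Assumption: with no readers at all, the intersection is taken to be
    # the whole set of attributes.
    return set.intersection(*trees) if trees else set(racl)


def write_is_safe(racl, read_trees, target):
    """Theorem 1: the union of the authorization trees of the previous
    reads must be contained in CUAT(target)."""
    read_so_far = set().union(*read_trees)
    return read_so_far <= cuat(racl, target)


# Hypothetical read access lists.
racl = {
    "Student.name": {"ann", "bob"},
    "Student.ssn": {"ann"},
    "Course.title": {"ann", "bob"},
}
```

With these lists, writing into `Course.title` after reading `Student.name` is safe, while writing into `Course.title` after reading `Student.ssn` is not, since `bob` can read the target but not the source.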

3  The  generalized  Transactions  &  Methods  Model 

The transactions model presented in this section is a generalization of the model in [GenGud97]
discussed above, in that a transaction may now call methods and pass parameters to them.
Such a method may further call other methods and in return may return a value or an
object to the calling method or transaction. Both the transaction and the method it calls
may issue Reads and Writes to the database; therefore, information flow may occur within
methods and it must be checked for. Formally, we define:

Transaction. A transaction consists of a sequence of Read or Write queries, or Method
calls, where:

Read query. val = read(O.Attr), where O is a database object/class, Attr is an attribute
and val is the variable that receives the result.

Write query. write(O.Attr, val), where O and Attr are as before and val is the value (or
variable) to be written into the object attribute.

Method with return value. val = meth(p1, ..., pn), where meth is the name of a called
method, p1...pn are the method's parameters (in the form of single instances or AT_yes
hierarchical trees) and val is the variable that receives the return value.

We also assume without loss of generality that a transaction does not contain any
control-flow statements (they can be embedded within a method) and therefore contains only calls to
the methods of the above type. As explained in the introduction, the analysis of transactions
is done in two phases:

1.  The  Method’s  compile-time  phase 

2.  The  Transaction’s  compile-time  phase. 

Each  method  is  analyzed  separately  at  the  time  it  is  compiled  and  stored  in  the  database, 
and  the  analysis  results  are  stored  in  specialized  security-related  tables.  The  transaction 
analysis  phase  uses  the  information  in  these  tables  at  the  time  the  transaction  is  compiled. 

3.1  Analysis  of  methods 

In  this  section  we  discuss  the  compile-time  analysis  of  a  single  method.  This  analysis  is  done 
once  when  the  method’s  code  is  inserted  into  the  database,  and  the  results  are  stored  in 
several  tables  associated  with  the  method.  The  results  of  this  analysis  are  composed  of  three 
parts: 

1.  The  information  flow  to  objects  accessed  for  writing  from  within  the  current  method. 

2.  The  information  flow  to  the  return  value  of  the  method  (if  one  exists). 

3.  The  forward  information  flow  via  the  parameters  to  methods  which  are  called  from 
within  the  current  method. 

These  results  are  stored  in  special  database  tables  and  are  used  in  the  next  stages  of 
the  analysis.  Since  the  actual  parameters  to  the  method  are  not  known,  we  use  "virtual” 
symbols  in  the  tables.  At  transaction  analysis  time  the  real  information  flow  is  substituted 
for  these  virtual  symbols. 
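
The substitution of real flows for virtual symbols can be pictured as a simple set rewrite. The sketch below is illustrative only: the spelling `_$1` for the virtual symbol of the first parameter, the stored entry and the attribute names are assumptions, and `instantiate` is a hypothetical helper.

```python
def instantiate(flow, actuals):
    """Replace each virtual symbol in a stored flow set by the flow set of
    the corresponding actual parameter; other items are kept unchanged."""
    result = set()
    for item in flow:
        if item in actuals:
            result |= actuals[item]
        else:
            result.add(item)
    return result


# Hypothetical result of method analysis: a written object together with
# the flow into it, expressed partly through the virtual symbol "_$1".
stored_entries = [("Log.entry", {"_$1", "Student.name"})]

# At transaction analysis time the flow into parameter 1 is known.
actuals = {"_$1": {"Student.ssn"}}

resolved = [(obj, instantiate(flow, actuals)) for obj, flow in stored_entries]
```

After substitution, the entry for `Log.entry` lists only real database attributes, so it can be compared against access information without reanalyzing the method.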

The method analysis itself uses program-flow analysis techniques which are common in
Compiler and program Optimization ([Aho86, Muchnick81, Denn86]). We assume that we
have a parser that can generate a syntax tree of the method, and the flow-analysis phase
operates on this tree. We assume that the method is written in a programming language like
Pascal or C (C++) with some restrictions. We assume that all variables in the methods are
strongly typed, i.e. pointers and memory management operators are either strongly typed
too or forbidden. Another limitation is that there are no static or global variables, i.e. all
information between methods is passed via the parameters, and the goto statement is also
forbidden. We also use the following notation:

1. FLOW(a), where a is an OODB object or an attribute of an object or a local variable
of the method: a list of all b (OODB objects, local variables and virtual symbols) such that
there is a potential information flow to a, b → a.

2. IN stores the potential flow to the current block of statements ([Muchnick81]).*

3.2  Assignment  statement 

Let us begin with the simple case of the assignment statement:

S → id = E

Assignment is the simplest way for information to flow. All information from expression E
flows into variable id. Note that syntax analysis of the expression E is required to find all
variables (or functions) participating in the expression. We use the notation a ∈ E to specify
the variables involved in the expression E.

$$\mathrm{FLOW}(id) = \bigcup_{a \in E} \bigl\{ \{a\} \cup \mathrm{FLOW}(a) \bigr\} \cup \mathrm{IN} \tag{3.1}$$

We must include IN if this statement is part of an IF or a LOOP block, as explained
below.
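
Rule (3.1) can be sketched as a small function over a table of flow sets. This is only an illustration: the table is a plain dictionary, the expression is represented by the list of variable names it mentions, and the name `flow_assign` is hypothetical.

```python
def flow_assign(flow, target, expr_vars, in_flow=frozenset()):
    """Apply rule (3.1): FLOW(target) becomes the union, over every variable
    a in the expression, of {a} and FLOW(a), together with IN."""
    new = set(in_flow)
    for a in expr_vars:
        new |= {a} | flow.get(a, set())
    flow[target] = new
    return flow


# Hypothetical starting state: y already carries information from O.Attr.
flow = {"y": {"O.Attr"}}
flow_assign(flow, "x", ["y", "z"])                 # x = y + z
flow_assign(flow, "w", ["x"], in_flow={"c"})       # w = x, inside a block with IN = {c}
```

After these two assignments, FLOW(x) contains y, z and O.Attr, and FLOW(w) additionally contains x and the variable c inherited through IN.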


3.3  Block  of  statements 

We define the relation before between two statements as follows: S1 before S2 ⇒ if S1
leads to an information flow FLOW_S1 to variable x and S2 leads to an information flow
x → y, such that FLOW_S1 → y. Formally, S1 before S2 ⇒ if ∃ FLOW(x) updated in
S1 (denoted FLOW_S1) and ∃ a flow x → y in S2, such that FLOW_S1 → y exists, and statement S1
is executed before S2. Thus, the statement S1 must be analyzed before S2. Consider the
following statements block:

S → S1; S2; ... Sn

S1 before S2 before ... before Sn     (3.2)

We say that inside a block of statements there are sequential flows. The analysis of a
block of statements must therefore pass sequentially through all the statements in the block.
We shall see in following subsections another order of flows for loop statements.

3.4 Read/Write queries

A method may contain Read queries and Write queries. Such queries will cause information
flow as follows. Assume a Read query: a = read(O.Attr), then

$$\mathrm{FLOW}(a) = \{O.Attr\} \cup \mathrm{FLOW}(O.Attr) \cup \mathrm{IN} \tag{3.3}$$

We say that as a consequence of a read query the information flows to a from O.Attr
as well as from all objects and variables from which information flowed to O.Attr
before the Read. For a Write query: write(O.Attr, a), then

$$\mathrm{FLOW}(O.Attr) = \{a\} \cup \mathrm{FLOW}(a) \cup \mathrm{IN} \tag{3.4}$$

*We will use in the rest of the paper the term "information flow", although it should be clear that we mean
essentially only "potential information flow".

The information flows to O.Attr from a, as well as from all objects and variables from which
information flowed to a and to the Write statement.

Now, for every Write query we must record the flow that was caused by the Write in a
table entry, since later on we will analyze whether that flow was safe or not. Each compiled
method will have its own WRITES table containing one entry for every Write query in that
method. An entry in the WRITES table has the format (Obj, WrInFlow), where Obj
is the OODB object accessed for writing and WrInFlow is the set of OODB objects and virtual
symbols that is contained in FLOW(a). (Note that local variables are not inserted into this
table.)
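
Rules (3.3) and (3.4) and the recording of WRITES entries can be sketched as follows. This is an illustration under assumptions: the flow table is a dictionary, local variables are identified by an explicit set of names, the WRITES table is a list of pairs, and the function and attribute names are hypothetical.

```python
def analyze_read(flow, var, attr, in_flow=frozenset()):
    """Rule (3.3): var = read(attr)."""
    flow[var] = {attr} | flow.get(attr, set()) | set(in_flow)


def analyze_write(flow, writes, attr, var, local_vars, in_flow=frozenset()):
    """Rule (3.4): write(attr, var), plus one WRITES entry that keeps only
    database objects and virtual symbols (local variables are dropped)."""
    flow[attr] = {var} | flow.get(var, set()) | set(in_flow)
    wr_in_flow = {s for s in flow[attr] if s not in local_vars}
    writes.append((attr, wr_in_flow))


flow, writes = {}, []
local_vars = {"v"}

analyze_read(flow, "v", "Student.ssn")                    # v = read(Student.ssn)
analyze_write(flow, writes, "Log.entry", "v", local_vars)  # write(Log.entry, v)
```

The resulting WRITES table holds the single entry (`Log.entry`, {`Student.ssn`}): the local variable v is only a carrier and does not appear in the recorded flow.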

3.5  If  statement 

Consider the following example:

if (a > 1)
    b = 1;
else
    b = 2;

There is no direct information flow from a to b in this example. However, it is possible
after the execution of the code to draw a conclusion from the value of b about the value of
a. So there is a potential information flow from the if condition to both the then and the
else branches (see [Denn86] for an extensive discussion of this example). Formally,

S → if (E) S1; else S2;

$$\mathrm{IN}(S_1) = \mathrm{IN}(S_2) = \bigcup_{a \in E} \bigl\{ \{a\} \cup \mathrm{FLOW}(a) \bigr\} \cup \mathrm{IN} \tag{3.5}$$
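
Rule (3.5) can be sketched by computing the branch IN set once and passing it to the assignments of both branches. The sketch is illustrative: branches are given as lists of (target, variables-in-expression) pairs, and merging the two branch results by union is an assumption of this example rather than something the text prescribes.

```python
def assign(flow, target, expr_vars, in_flow):
    """Assignment rule: FLOW(target) = flows of the expression, plus IN."""
    new = set(in_flow)
    for a in expr_vars:
        new |= {a} | flow.get(a, set())
    flow[target] = new


def analyze_if(flow, cond_vars, then_branch, else_branch, in_flow=frozenset()):
    """Rule (3.5): both branches receive IN extended by the condition flows."""
    branch_in = set(in_flow)
    for a in cond_vars:
        branch_in |= {a} | flow.get(a, set())
    merged = {}
    for branch in (then_branch, else_branch):
        local = {k: set(v) for k, v in flow.items()}   # analyze on a copy
        for target, expr_vars in branch:
            assign(local, target, expr_vars, branch_in)
        for k, v in local.items():                     # assumed merge: union
            merged.setdefault(k, set()).update(v)
    return merged


# if (a > 1) b = 1; else b = 2;   with a previously read from O.Attr
flow = {"a": {"O.Attr"}}
result = analyze_if(flow, ["a"], [("b", [])], [("b", [])])
```

Although both branches assign constants, FLOW(b) ends up containing a and O.Attr, which captures the implicit flow from the condition.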


3.6  Loop  Statement 

The loop statement is a more complicated case than the if statement. Consider the following
example:

x = 1; a = 1;
for (i = 0; i < n; i++)
    x = x + a;

Again, there is no direct flow from n to x, but after the loop has executed, the value of x
depends on n. Another problem is the repeated execution of the loop body:

while (...)
{ a = b;
  b = c;
  ... }

One-pass analysis finds the flows b → a and c → b. If the loop body is executed more than
once, the flow c → a also exists. In the previous analysis of a statements block, we only
considered sequential flows. Loop statements have the property of cyclic flow, i.e. succeeding
statements also affect preceding ones. Thus, for loops a two-pass analysis is needed. Formally,

S → while (E) S1

$$\mathrm{IN}(S_1) = \bigcup_{a \in E} \bigl\{ \{a\} \cup \mathrm{FLOW}(a) \bigr\} \cup \mathrm{IN}, \quad \text{analyze\_twice } S_1 \tag{3.6}$$

where analyze_twice is a command to the Analyzer to scan the statement twice. A similar
analysis is done for a for statement.
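
The effect of the second pass can be seen on the two-assignment loop body a = b; b = c. The sketch below is an illustration, with the loop body given as a list of (target, variables-in-expression) pairs and `analyze_while` a hypothetical name; recomputing the loop's IN set at the start of each pass is a choice made for this example.

```python
def analyze_while(flow, cond_vars, body, in_flow=frozenset(), passes=2):
    """Rule (3.6): analyze the loop body with IN extended by the condition
    flows, scanning the body `passes` times."""
    for _ in range(passes):
        loop_in = set(in_flow)
        for a in cond_vars:
            loop_in |= {a} | flow.get(a, set())
        for target, expr_vars in body:
            new = set(loop_in)
            for v in expr_vars:
                new |= {v} | flow.get(v, set())
            flow[target] = new
    return flow


body = [("a", ["b"]), ("b", ["c"])]      # while (...) { a = b; b = c; }

one_pass = analyze_while({}, [], body, passes=1)
two_pass = analyze_while({}, [], body, passes=2)
```

A single pass records only b → a and c → b; the second pass adds c to FLOW(a), which is the cyclic flow that arises when the body runs more than once.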


Variable    FLOW
z           {t}
t           {z, _$1}

Table 1: Result of one pass analysis of loop

Table 2: Result FLOW table


3.7  Example 

Let us consider the following example of a partial method's code, and analyze the flows
occurring within it.

Copy(int x) { int z, t;
    z = 0; t = 1;
    while (t == 1)
    { z = z + 1;
      if (z == x)
          t = 0; } }

At the beginning the flows to local variables z and t are empty (∅). The result FLOW
of the first pass of the analysis of the while statement is shown in Table 1. The flow
IN of the loop body is variable t, then analysis of the first assignment statement finds
FLOW(z)={t}. The analysis of the if statement finds FLOW(t)={z,_$1}.

The complete FLOW table of the method is shown in Table 2. The second pass analysis
of the while statement finds the flow _$1 → z. It is essential because during execution of
the loop the value of the parameter x is copied into the local variable z.
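
The two passes over the Copy method can be traced with a few lines of code. This is a sketch under assumptions chosen to reproduce the first-pass flows stated in the text: the parameter x is represented directly by the virtual symbol `_$1`, a variable is not listed in its own flow set, and the loop's IN set is recomputed at the start of each pass. The helper names are hypothetical.

```python
def expr_flow(flow, names):
    """Union of {a} and FLOW(a) for every name a used in an expression."""
    out = set()
    for a in names:
        out |= {a} | flow.get(a, set())
    return out


def assign(flow, target, names, in_flow):
    """Assignment rule, dropping the target itself from its own flow set."""
    flow[target] = (expr_flow(flow, names) | in_flow) - {target}


def analyze_copy(passes):
    flow = {"z": set(), "t": set()}       # flows to z and t start empty
    x = "_$1"                             # virtual symbol for parameter x
    for _ in range(passes):
        loop_in = expr_flow(flow, ["t"])              # while (t == 1)
        assign(flow, "z", ["z"], loop_in)             # z = z + 1
        if_in = expr_flow(flow, ["z", x]) | loop_in   # if (z == x)
        assign(flow, "t", [], if_in)                  # t = 0
    return flow


first_pass = analyze_copy(1)
second_pass = analyze_copy(2)
```

Under these assumptions the first pass yields FLOW(z) = {t} and FLOW(t) = {z, _$1}, and only the second pass places `_$1` in FLOW(z).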

3.8  Method  Calls 

In this section we discuss the analysis of method calls. Since every method is analyzed
independently, separate tables are created for each method. One table, called the CALLS
table, is used to store the flow into the parameters of the called methods. The other table,
called the RETURN table, contains the flow returned from a method call.

The CALLS table contains an entry for each method called from within the current
method. Generally, when a method A calls another method B, information may flow in both
directions: from A to B via the Input or Input/Output parameters (note that we forbid the use
of global variables), and from B to A via the Output or Input/Output parameters or via
the Return value. In this paper we restrict the discussion to Input parameters and the Return
value only; the case of Output and Input/Output parameters is discussed in [Gendler97].

Basically, an entry in the CALLS table is (method, ParInFlow_1 ... ParInFlow_n), where
method is the name of the called method and the ParInFlow_i are sets of objects and virtual
symbols that contain the information flow into the method's parameters. If the called method
returns a value via the Return statement, then this value is also denoted by a virtual symbol.
Virtual symbols are used to denote information flow sets which are not known at Method
compilation time and are instantiated only at Transaction compilation time. There are two
kinds of virtual symbols: _$i denotes information flow into parameter number i, and _@j
stands for information flow from the called method (via its return value) and corresponds to
entry number j in the CALLS table.

Figure 2: Analysis of transaction

Assume now a transaction that invokes the method A. As the transaction is being compiled,
the actual parameters passed to the methods are known. Let us assume that the real objects
O1 and O2 stand for the parameters of A; therefore, the virtual symbols _$i are bound to
them. When B is processed, flows to the actual parameters substitute the virtual symbols
_$i of B. The union of flows to all return values of B substitutes the corresponding virtual
symbol _@j of A. Figure 2 demonstrates this process (see a full discussion in Section 5).

We now specify precisely the information that is entered into the tables. The first case
is when the method is called within an arithmetic or another expression (i.e., as a function).

$b = E \rightarrow method(E_1, E_2 \ldots E_n)$

The new entry in the CALLS table is:

$(method,\ \mathrm{GlFlow}(\bigcup_{a \in E_1} \{\{a\} \cup \mathrm{FLOW}(a)\}) \ldots \mathrm{GlFlow}(\bigcup_{a \in E_n} \{\{a\} \cup \mathrm{FLOW}(a)\} \cup \mathrm{IN})) \qquad (3.7)$

where GlFlow(a) is the set FLOW(a) without local variables. To store the return value,
e.g. within the flow of the expression E (which flows into b), the virtual symbol _@j is used.
A similar case is a call of a method which returns nothing.

3.8.1  Return  value 

The return value provides the means for backward information flow from the called method
to the method or the transaction which has called it.

$S \rightarrow \mathrm{return}(E)$

An entry in the RETURN table is inserted for each Return statement, and it contains
the following information flow:

$\mathrm{GlFlow}(\bigcup_{a \in E} \{\{a\} \cup \mathrm{FLOW}(a)\} \cup \mathrm{IN}) \qquad (3.8)$

Note that since there may be multiple Return statements, several entries will be created.
However, since we do not know which Return will be taken and we are interested in potential
flow, these entries will be merged into a single entry at Transaction analysis time.

4  Example 

Consider the university database shown in Figure 1. Let us suppose that the salary of
the university president is the most secret piece of information in the database. It is also
reasonable to assume that the president of the university has the highest salary among all
university employees. We will investigate an example of illegal information flow caused by
the called methods, as a consequence of which the president's salary is compromised.

The methods' code is shown below. In this code we have not used any special query
language for read/write queries. Inside a method we just use the statement read_object(o.A),
where o is a database object and A is an attribute. It is important to note that at Method
analysis time, since in a statement such as read_object(o.A) the specific instance of o is not
known, we use instead the root class O of that object. In actuality, when the transaction
analysis is performed, this root class is replaced with the restricted set of sub-classes and
objects that the transaction's initiator is allowed to access (i.e., the AT_yes subtree). In the
tables, though, we denote it as the root class, or as a virtual symbol.

The  example  below  shows  two  methods,  one  calling  the  other.  For  each  method  we  show 
the  important  tables  generated  by  the  method  analysis  phase. 

Max_Payed_Employee(Employee_Hierarchy emp_tree)
{ int Max = 0;
  int ESSN, ESalary;
  Employee emp;
  for ∀ emp ∈ emp_tree
  { ESalary = read_object(emp.Salary);
    ESSN = read_object(emp.SSN);
    if (ESalary > Max)
      Max = Store_Results(ESSN, ESalary);
  }
  return Max; }

int Store_Results(int res1, int res2)
{ Dummy dummy;
  write(dummy.val1, res1);
  write(dummy.val2, res2);
  return res2; }

emp         {_$1}
ESalary     {_$1.Salary, emp.Salary}
ESSN        {_$1.SSN, emp.SSN}
Max         {ESalary, _$1.Salary, emp.Salary, _@1}

Table 3: Max_Payed_Employee.FLOW


Store_Results  |  {_$1.SSN, _$1.Salary}, {_$1.Salary}

Table 4: Max_Payed_Employee.CALLS


First, as a result of the analysis of the method Max_Payed_Employee, the table
Max_Payed_Employee.FLOW is generated as shown in Table 3. The first row shows the
fact that the local object emp contains the flow received via the parameter. The next two rows
represent the flows caused by the read queries. The fourth row shows the information flow
to the local variable Max via the statement

Max = Store_Results(ESSN, ESalary)  (4.1)

This is the first case of a method call discussed above. It is composed of the flow
from the if statement and the flow returned by the called method. The virtual symbol _@1
contains the flow returned from the Store_Results method. The flow into Max contains
the IN flow and the flow resulting from the expression if (ESalary > Max), so all the
information from the conditional expression flows to Max. It is ESalary and its FLOW,
which is {_$1.Salary, emp.Salary}. That explains the three components of the entry for Max
in the table.

Now, the call of the method Store_Results results in a new entry in the CALLS table,
shown in Table 4. The entry contains the name of the called method and its forward
information flow - i.e., the information flowing into the parameters, ESSN and ESalary,
plus the information flow to the statement (flow IN). Recall that the statement is part of an if
structure, so as seen above, {_$1.Salary} is added to the flow of each of the parameters.
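
The construction of this CALLS entry can be sketched in Python as follows. The sketch assumes that GlFlow drops every symbol rooted at a local variable (including attributes read through the local object emp), and that the IN flow is added to every parameter, as in the worked example above. The helper names `expr_flow`, `gl_flow` and `calls_entry` are hypothetical.

```python
def expr_flow(reads, flow):
    """Union of {a} | FLOW(a) over every variable a read by an expression."""
    out = set()
    for a in reads:
        out |= {a} | flow.get(a, set())
    return out


def gl_flow(symbols, local_vars):
    """Drop local variables and attributes reached through local objects."""
    return {s for s in symbols if s.split(".")[0] not in local_vars}


def calls_entry(method, arg_reads, flow, in_flow, local_vars):
    """Build one CALLS entry: (method, [ParInFlow_1, ..., ParInFlow_n])."""
    return (method,
            [gl_flow(expr_flow(reads, flow) | in_flow, local_vars)
             for reads in arg_reads])


LOCALS = {"emp", "ESalary", "ESSN", "Max"}
FLOW = {
    "emp": {"_$1"},
    "ESalary": {"_$1.Salary", "emp.Salary"},
    "ESSN": {"_$1.SSN", "emp.SSN"},
}
# Implicit flow of the enclosing condition: if (ESalary > Max)
IN = expr_flow(["ESalary", "Max"], FLOW)

entry = calls_entry("Store_Results", [["ESSN"], ["ESalary"]], FLOW, IN, LOCALS)
print(entry)
```

Under these assumptions the computed entry contains {_$1.SSN, _$1.Salary} for the first parameter and {_$1.Salary} for the second, which agrees with Table 4.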

Now let us see the analysis of the method Store_Results. The FLOW table is empty and is
not interesting. The WRITES table contains two entries, as shown in Table 5. The RETURN
table contains just the entry (_$2). In the next section we shall see how the information
contained in the above tables is combined for the detection of non-safe information flow.

5  Analysis  of  transactions 

The analysis of a transaction is the final stage of the analysis described in this paper.
Transactions may contain queries to the database - i.e., Read/Write queries - and calls to
various methods. We assume that all components of a transaction are executed sequentially -
i.e., no control flow statements are allowed (this is not a real limitation, since an analysis
similar to the one described for methods can be done, or alternatively they can be inserted
within another method). In terms of Access Control, we assume the authorization model for
methods in which a method corresponds to an Access Type, i.e., internal actions of the method
do not require additional authorization (see [GalOz93]).

Dummy.val1    {_$1}
Dummy.val2    {_$2}

Table 5: Store_Results.WRITES

The transaction analysis uses the auxiliary results of the methods analysis - i.e., the tables
WRITES, CALLS and RETURN discussed above. We add a new table called TR_FLOW
to store all information flows caused by the transaction at any point of time. The table
TR_FLOW will be used for computing the actual flows to the parameters of the methods.
Note that information flow has the property of transitivity: if the flows a → b and b → c are
safe, then the flow a → c is safe too. This means that to verify the safety of an information flow
o_1 → o_2 caused by writing within a method, there is no need to record all flows o_i → o_1
caused by previous write queries within the transaction which were analyzed to be safe.
In checking for safeness we use the ideas discussed in Section 2.3. That is, for Read Queries
we first obtain by the authorization algorithm the subtree AT_yes and use it as the actual
flow. The actual information flow is computed by binding the flow from the transaction with
the virtual symbols recorded within the methods analysis tables. The analysis algorithm
TranFlowControl is as follows:

TranFlowControl(transaction, initiator)
  for ∀ method meth invoked by the transaction
    switch (meth)

      case read query a = read(O.Attr_i)
        Generate AT_yes using Initiator privileges
        TR_FLOW(a) = TR_FLOW(a) ∪ AT_yes(query)
        break;

      case write query write(O.Attr_j, a)
        Generate AT_yes using Initiator privileges
        if not Safe(TR_FLOW(a), AT_yes(O.Attr_j), initiator)
          return FALSE
        break;

      case method a = meth(p_1 ... p_n)
        if initiator ∉ RACL(meth)
          break; /* no need to check flow */
        if not MethFlowControl(meth, TR_FLOW(p_1) ... TR_FLOW(p_n))
          return FALSE
        TR_FLOW(a) = TR_FLOW(a) ∪ (∪ meth.RETURN)
        break;

  return TRUE

The MethFlowControl algorithm works as follows: first, it substitutes the virtual
symbols with real values, then it verifies all write queries within the method and checks for
safe information flow. To verify the consistency of information flow in all invoked methods, the
algorithm works in a recursive manner. The decision about the safety of a particular information
flow is made during the process of the WRITES table binding:

MethFlowControl(m, fl_1 ... fl_n)
  bind(_$1, fl_1)
  ...
  bind(_$n, fl_n)
  for ∀ entry i in CALLS table (meth, Par_1 ... Par_n)
    if not MethFlowControl(meth, Par_1 ... Par_n)
      return FALSE
    bind(_@i, ∪ meth.RETURN) /* all entries taken */
  for ∀ entry in WRITES table (Obj, WrFlow)
    if not Safe(WrFlow, Obj)
      return FALSE
  return TRUE

Note that the binding of the Return value is possible, since it is known when we return from
the recursion.

The Safe algorithm checks whether a given information flow is safe using the same algorithm
as described in Section 2.3 (and in detail in [GenGud97]). Remember, CUAT(obj)
stands for the intersection of all objects which can be read by users who can also read object
obj.

Safe(flow, obj)
  if Contain(flow, CUAT(obj))
    return TRUE;
  else
    return FALSE;

The  proof  of  correctness  for  the  above  algorithms  is  quite  obvious  and  is  given  in  [Gendler97]. 
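
The recursive binding can be made concrete with a small Python sketch. It is not the authors' implementation: flows are modelled as sets of strings such as "Student.Salary", a symbol _$i.Attr selects the members of the bound flow that end in .Attr, and the function returns the bound return flow together with the verdict so that _@j can be bound after the recursive call. The RETURN entry assumed for Max_Payed_Employee and the contents of CUAT are illustrative assumptions.

```python
def substitute(symbols, actuals, returns):
    """Replace virtual symbols by the flows bound to them."""
    out = set()
    for sym in symbols:
        if sym.startswith("_$"):          # _$i or _$i.Attr: parameter i
            head, _, attr = sym.partition(".")
            actual = actuals[int(head[2:]) - 1]
            out |= {o for o in actual if not attr or o.endswith("." + attr)}
        elif sym.startswith("_@"):        # _@j: return flow of CALLS entry j
            out |= returns[int(sym[2:]) - 1]
        else:
            out.add(sym)
    return out


def safe(flow, obj, cuat):
    """A flow into obj is safe if it is contained in CUAT(obj)."""
    return flow <= cuat[obj]


def meth_flow_control(name, actuals, tables, cuat):
    """Return (is_safe, bound return flow) for one method invocation."""
    table = tables[name]
    returns = []
    for callee, par_flows in table["CALLS"]:
        bound = [substitute(p, actuals, returns) for p in par_flows]
        ok, ret = meth_flow_control(callee, bound, tables, cuat)
        if not ok:
            return False, set()
        returns.append(ret)               # bind(_@j, union of callee RETURN)
    for obj, wr_flow in table["WRITES"]:
        if not safe(substitute(wr_flow, actuals, returns), obj, cuat):
            return False, set()
    ret_flow = set()
    for entry in table["RETURN"]:         # all entries taken
        ret_flow |= substitute(entry, actuals, returns)
    return True, ret_flow


TABLES = {
    "Max_Payed_Employee": {
        "CALLS": [("Store_Results",
                   [{"_$1.SSN", "_$1.Salary"}, {"_$1.Salary"}])],
        "WRITES": [],
        "RETURN": [{"_$1.Salary", "_@1"}],   # assumed: GlFlow of FLOW(Max)
    },
    "Store_Results": {
        "CALLS": [],
        "WRITES": [("Dummy.val1", {"_$1"}), ("Dummy.val2", {"_$2"})],
        "RETURN": [{"_$2"}],
    },
}
PUBLIC = {"Student.SSN", "Student.Salary"}   # assumed CUAT of a public object
CUAT = {"Dummy.val1": PUBLIC, "Dummy.val2": PUBLIC}

student = {"Student.SSN", "Student.Salary"}
president = {"Manager.SSN", "Manager.Salary",
             "President.SSN", "President.Salary"}
student_result = meth_flow_control("Max_Payed_Employee", [student], TABLES, CUAT)
president_result = meth_flow_control("Max_Payed_Employee", [president], TABLES, CUAT)
print(student_result)     # safe: only student data reaches Dummy
print(president_result)   # unsafe: president data would reach Dummy
```

Returning a pair instead of a bare Boolean is a design choice of the sketch; it keeps the binding of the return symbol local to the recursion rather than storing it in a shared table.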
5.1  Example  -  continued 

Let us illustrate the execution of the algorithm by the example of the method
Max_Payed_Employee discussed in the previous section. Assume the following transaction,
which invokes the method Max_Payed_Employee.

begin transaction
  1. create Private
  2. emp_tree = read(Employee.{Salary, SSN})
  3. a = Max_Payed_Employee(emp_tree)
  4. write(Private, a)
end transaction

The object Private is the private object created by the initiator of the transaction. The
AT_yes result of the read query is returned to emp_tree. This tree is a sub-tree of the
Employee root class, and includes the objects permitted to the initiator of the transaction. (Note
that different sub-trees may be authorized for different users.) The method Max_Payed_Employee
receives the tree as a parameter and returns the salary of the highest-paid employee within it.
At first glance the transaction seems legal. Even if the method returns the salary of the
president (assuming the initiator has access to it), the initiator will write it to his private
object, which nobody else has access to, and no non-safe information flow will occur. However,
to see the whole picture we must analyze the information flows caused by the method
Max_Payed_Employee. Applying the algorithm TranFlowControl we get the following
results (refer to the numbered items in the transaction):

1.  No  flow  recorded  yet. 

2.  TR_FLOW(emp_tree) = AT_yes(Employee.{Salary, SSN}, initiator)

3.  The flow from the method is computed by calling:

    MethFlowControl(Max_Payed_Employee, TR_FLOW(emp_tree))

    and binding the parameters in the table Max_Payed_Employee.FLOW:

    bind(_$1, TR_FLOW(emp_tree))

    Next, the following recursive call is made:

    MethFlowControl(Store_Results, {TR_FLOW(emp_tree).SSN,
    TR_FLOW(emp_tree).Salary}, {TR_FLOW(emp_tree).Salary})

    The substitution of real data inside the virtual symbols is performed:

    bind(_$1, {TR_FLOW(emp_tree).SSN, TR_FLOW(emp_tree).Salary})

    bind(_$2, {TR_FLOW(emp_tree).Salary})

    Now we have to analyze the WRITES table of this method. The decision about the
    safety of the information flow is made by the two calls to the algorithm Safe:

    Safe({TR_FLOW(emp_tree).SSN, TR_FLOW(emp_tree).Salary}, Dummy.val1)

    Safe({TR_FLOW(emp_tree).Salary}, Dummy.val2)

    Clearly, the information flows
    {TR_FLOW(emp_tree).SSN, TR_FLOW(emp_tree).Salary} → Dummy.val1 and
    {TR_FLOW(emp_tree).Salary} → Dummy.val2 are detected.

4.  The flow into the object Private is computed, but since this is a private object, its
    CUAT is at least AT_yes; therefore this write is obviously safe.

The safeness of this transaction is therefore dependent on the safeness of step 3. Let
us imagine that Dummy is a public object open to all users (i.e., we can assume that
CUAT(Dummy) may contain very few objects, maybe the salaries of the student employees
only). So the safety of the information flow depends on the privileges of the transaction
initiator u. If the transaction initiator is a user allowed to read students' salaries only, i.e.

TR_FLOW_u(emp_tree) = Student.{SSN, Salary}

then there is no non-safe information flow. But if the transaction initiator is a user allowed
to read the president's salary, i.e.

TR_FLOW_u(emp_tree) = {Manager, President}.{SSN, Salary}

then the method Max_Payed_Employee plays the role of a Trojan Horse [Samar97]. Clearly,
this illegal flow is discovered, since TR_FLOW is not contained in CUAT(Dummy).

5.2  Authorization  Rules  Maintenance 

As was explained in Section 2.3 and used in the Safe algorithm, the main data structure against
which the flow computed at compile time is checked is the CUAT, the common
user access tree. This structure can be uniquely determined for a given set of authorization
rules. However, when authorization rules change, this structure has to be recomputed.

There are basically two approaches. One is to recompute it every time a new transaction
is compiled. This carries heavy computational overhead but saves storage. The other option
is to compute it once, store it for each object (class), and maintain it when authorization
rules are added or deleted. Since changes in authorization rules are much less frequent than
transaction compilations, this is much cheaper computationally. Algorithms to maintain the
CUAT structure were also presented in [GenGud97]. A similar approach is advocated by [Bertino(96)].
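
The stored-and-maintained option can be sketched in Python as below. The sketch models authorization rules as a map from each user to the set of objects that user may read, computes CUAT(obj) as the intersection of the readable sets of all users who can read obj, and simply rebuilds the stored table whenever a rule changes. The class name `CuatStore`, the flat set representation (rather than a tree) and the full rebuild are assumptions; the incremental maintenance algorithms of [GenGud97] are not reproduced here.

```python
def cuat(obj, readable):
    """Intersection of the readable sets of all users who can read obj."""
    reader_sets = [objs for objs in readable.values() if obj in objs]
    if not reader_sets:
        return set()   # assumption: with no readers, nothing may flow into obj
    return set.intersection(*reader_sets)


class CuatStore:
    """Stores CUAT per object and refreshes it when authorization rules change."""

    def __init__(self, readable):
        self.readable = {u: set(objs) for u, objs in readable.items()}
        self._rebuild()

    def _rebuild(self):
        objects = set().union(*self.readable.values())
        self.table = {o: cuat(o, self.readable) for o in objects}

    def grant(self, user, obj):
        self.readable.setdefault(user, set()).add(obj)
        self._rebuild()   # full rebuild; incremental maintenance is not shown

    def revoke(self, user, obj):
        self.readable.get(user, set()).discard(obj)
        self._rebuild()


store = CuatStore({
    "student_clerk": {"Student.Salary", "Dummy"},
    "payroll_admin": {"Student.Salary", "President.Salary", "Dummy"},
})
before = set(store.table["Dummy"])
store.revoke("student_clerk", "Dummy")
after = set(store.table["Dummy"])
print(before)   # Dummy is widely readable, so little may flow into it
print(after)    # fewer readers of Dummy, so more may flow into it
```

The example shows why the structure must follow rule changes: revoking one user's access to Dummy enlarges CUAT(Dummy), which changes the outcome of later Safe checks.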

6  Conclusions 

In this paper we discussed the problem of protecting against unsafe information flow in
object-oriented databases. Previous models for such protection provide a run-time mechanism
which carries a considerable run-time overhead. This paper presents a compile-time
solution to the problem. It relies on the compile-time analysis of methods, which uses well
known program analysis techniques. It then uses the results of the Methods analysis phase in
analyzing the transactions and deciding at transaction compile-time whether an unsafe
information flow exists or not. In the future we would like to consider the case of distributed
transactions (or autonomous objects), when no single centralized control is available for the
analysis phase.
Abstract

Access control is the cornerstone of information security
and integrity, but the semantic diversity of object
models makes it difficult to provide a common
foundation for access control in object-oriented systems.
This paper presents a primitive capability-based
access control architecture that can model a variety of
authorization policies for object-oriented systems.

The architecture described is integrated at the meta-object
level of the Meta-Object Operating System Environment
framework, providing a common foundation
for access control in heterogeneous object models.

1  Introduction 

Access control is of critical importance to the security
and integrity of multiuser distributed systems,
from distributed databases, local networks and intranets
to the Internet itself. While object-oriented
technology has become a touchstone for developing
modern distributed systems, the full potential of access
control mechanisms for objects has yet to be realized.
Most of the advances and successful applications
of object access control have been confined to the area
of database security [5, 6, 13, 19, 20, 26].

Several factors have limited the incorporation of
access control mechanisms in mainstream distributed
object systems technology. Object models are heterogeneous
in nature with tremendous semantic diversity.
Authorization policies, which are greatly influenced by
the specific object models they are designed to protect,
cannot be applied to other object models. Moreover,
most access control mechanisms are brittle, incapable
of supporting multiple policies.

The Meta-Object Operating System Environment
(MOOSE) being developed at the University of Tulsa
provides a framework for the development of verifiably
secure heterogeneous distributed objects with a combination
of formal methods and object technology [16].
This paper describes a capability-based access control
architecture for distributed objects. This architecture
extends an earlier incarnation of the MOOSE framework
with a "meta-level integration" of primitive security
mechanisms that can be used to implement a
variety of authorization models.

Object-oriented systems share a common theoretical
underpinning - the notion of a meta-object.
MOOSE uses the meta-object concept to model diverse
object-oriented systems and to provide a foundation
for secure interoperability. The Meta-Object
Model (MOM) in MOOSE expresses diverse object-oriented
features using a few core mechanisms. By
integrating security mechanisms into MOM, it is possible
to capture a variety of authorization models for
object-oriented systems. In particular, Access Control
Lists (ACLs) and access monitors are integrated to
permit the modeling of various forms of Discretionary
Access Control (DAC), Mandatory Access Control
(MAC) and Role-Based Access Control (RBAC).

Capabilities provide a useful mechanism for reconciling
the inherent differences between identity-based
DAC, label-based MAC and RBAC. Capabilities are
unforgeable tokens that give the possessor certain
rights to an object [11, 21, 22]. Traditionally, they
have been used to keep track of access rights at run-time.
The flexibility of capabilities suggests their use
as a meta-model for access control to concurrent objects
operating in distributed environments.

This paper proposes a capability-based access control
architecture for meta-objects as a common foundation
for security in heterogeneous distributed object
systems. The paper begins with an overview of
access control in object-oriented systems which also
motivates the work. The Meta-Object Model (MOM)
and the access control architecture are described in detail
along with their roles in capturing heterogeneous
access control models. The paper concludes with a
comparison of related work and future research directions.


2  Access  Control  of  Objects 

Object-oriented systems are composed of classes,
instances, attributes (instance variables) and methods.
These components support encapsulation, modularity
and re-use through message-passing, inheritance
and aggregation.

The goal of access control is to protect resources
(objects) from unauthorized access by users (subjects).
An authorization state is a function State :
(Subject × Object × Privilege) → Boolean, where the
subject's privilege is the access type. Often, the authorization
state is represented as a list of authorization
tuples, e.g., <Subject, Object, access_type>.
Each tuple declares that Subject has the access_type
privilege for Object. An access control model defines
domains for subjects, privileges and objects. Also, it
defines rules for implicit authorization and specifies a
set of commands to take systems from one authorization
state to another.
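
A minimal Python sketch of this definition is given below. It represents the authorization state as a set of tuples and exposes the State function as a membership test; the commands `grant` and `revoke` are only one possible choice of state-changing commands, and the class and method names are assumptions made for illustration.

```python
from typing import NamedTuple


class Authorization(NamedTuple):
    subject: str
    obj: str
    access_type: str


class AuthorizationState:
    """An authorization state stored as a set of <Subject, Object, access_type> tuples."""

    def __init__(self, tuples=()):
        self.tuples = set(tuples)

    def state(self, subject, obj, privilege):
        """State : (Subject x Object x Privilege) -> Boolean."""
        return Authorization(subject, obj, privilege) in self.tuples

    # Commands that take the system from one authorization state to another.
    def grant(self, subject, obj, privilege):
        self.tuples.add(Authorization(subject, obj, privilege))

    def revoke(self, subject, obj, privilege):
        self.tuples.discard(Authorization(subject, obj, privilege))


auth = AuthorizationState([Authorization("Cashier1", "Jody::Address", "read")])
can_read = auth.state("Cashier1", "Jody::Address", "read")
can_write = auth.state("Cashier1", "Jody::Address", "write")
auth.revoke("Cashier1", "Jody::Address", "read")
can_read_after = auth.state("Cashier1", "Jody::Address", "read")
print(can_read, can_write, can_read_after)
```

This sketch covers explicit tuples only; rules for implicit authorization would have to be layered on top of the membership test.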

Although access control is fundamental to information
security and integrity, authorization models
in object-oriented database management systems
(OODBMSs) [23, 24, 27] lack a common perspective
compared with those for relational database management
systems (RDBMSs). In fact, many OODBMSs
do not provide any type of access control. The semantic
diversity of object models is partly to blame. For
example, some object models support multiple inheritance,
while others do not. The presence of competing
authorization models also introduces problems. Each
model addresses protection granularity, access types
and implicit authorization flow for objects in its own
special way. The dilemma is illustrated using a simple
electronic commerce example.


Figure 1. Object-oriented electronic commerce model.


Consider an object-oriented model of an electronic
commerce system shown in Figure 1. The protection
granularity specifies the finest units to be protected.
In relational databases, these range from tables to tuple
elements. Classes, objects, attributes and methods
are all viable atomic units for protection in an object-oriented
system.

Objects could be chosen as the finest unit of protection
in this example. While the resulting "all or
nothing" access to objects is efficient (only one authorization
set is needed per object), the approach
is rather inflexible. In a real-world scenario, a customer
must be able to execute Cashier::Checkout()
to pay for items. Using objects as atomic protection
units implies that Customer objects would have access
to Cashier objects. But this would allow Customer
objects to access a potentially sensitive list
of Cashier transactions as well. The inflexibility
of this approach forces most object models to respect
individual methods and attributes as atomic
units of protection, e.g., permitting a Customer access
to Cashier::Checkout(), but denying her access to
Cashier::Transactions.

Access types are another major issue in object-oriented
systems. Database authorization models
make use of read and write permissions. The privilege
of modifying rights can introduce access types
(e.g., grant and revoke), although in many systems
this is a power implicitly held by owners of objects.
Providing grant and revoke types has some disadvantages.
For example, the ability to grant can be
an access type (grant-grant) in its own right. Implementing
this feature usually requires a self-referential
access type.

Suppose that cashiers need to read customers' addresses,
but should not be able to modify them. Authorizing
<Cashier1, Jody::Address, read> only
gives Cashier1 read access to Jody::Address. If attributes
can be read or written directly (without using
a local accessor method), access types read and write
are desirable. However, if the object model relies on local
accessor methods to preserve encapsulation, then
separate methods should exist for reading and writing
attributes and effective read/write control can be
manifested via the execute privilege on those accessor
methods. The execute type for method-based access
control was proposed by Fernandez et al. [15].

Implicit authorization is a convenient way of propagating
permissions and protections in a large environment.
The idea is that permissions/protections can
be given to an entire class of subjects/objects using
one authorization rule. The application of this notion
to object-oriented systems is most natural because
the class concept is fundamental to most object
models. An instance can inherit its authorization
status from its class in the same way that it inherits
methods and attributes. The authorization rule
<Customer, StockItem::Price, read> gives all customers
access to the prices of all items in stock.

Generating exceptions to authorization rules is possible
through the use of negative authorizations and
strong/weak authorizations. A negative authorization
explicitly denies a subject access to an object. For example,
<Chris, Cashier::CheckOut(), >
explicitly prohibits Chris from checking out. Obviously,
the mixture of positive and negative authorizations
can lead to conflicts in an authorization model.
Therefore, authorization in such schemes is usually derived
by labelling rules as either weak or strong. Strong
rules cannot be overridden by weak ones. While the
presence of such rules makes it difficult to derive an
authorization rule for a particular event, they allow
for highly expressive authorization models.
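
The interplay of implicit, negative and strong/weak authorizations can be sketched in Python as follows. The resolution policy used here is an assumption chosen for illustration, not one prescribed by the text: rules given to a subject's class apply implicitly to the subject, any matching strong rule takes precedence over weak rules, a negative rule wins among rules of equal strength, and access is denied when no rule matches. The `Rule` fields, the `execute` access type and the `class_of` map are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    subject: str            # a subject or a class of subjects
    obj: str
    access_type: str
    positive: bool = True   # False marks a negative authorization
    strong: bool = False    # strong rules cannot be overridden by weak ones


def is_authorized(subject, obj, access_type, rules, class_of):
    """Resolve an access request against explicit and implicit rules."""
    subjects = {subject, class_of.get(subject)}   # implicit: rules of the class apply
    matching = [r for r in rules
                if r.subject in subjects
                and r.obj == obj
                and r.access_type == access_type]
    for strength in (True, False):                # strong rules are consulted first
        tier = [r for r in matching if r.strong == strength]
        if tier:
            return all(r.positive for r in tier)  # assumption: a denial wins in a tier
    return False                                  # assumption: closed policy


RULES = [
    Rule("Customer", "Cashier::CheckOut()", "execute"),
    Rule("Chris", "Cashier::CheckOut()", "execute", positive=False, strong=True),
]
CLASS_OF = {"Jody": "Customer", "Chris": "Customer"}

jody = is_authorized("Jody", "Cashier::CheckOut()", "execute", RULES, CLASS_OF)
chris = is_authorized("Chris", "Cashier::CheckOut()", "execute", RULES, CLASS_OF)
print(jody, chris)
```

Here the weak class-level rule authorizes every customer, while the strong negative rule carves out an exception for Chris.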


Figure 2. Message filter. [Panel (b): access denied.]

Messages are the principal medium of communication
in object-oriented systems. Jajodia et al. [19]
were the first to propose that messages be used as the
focus of access control mechanisms in object systems.
Towards this end they introduced a message filter for
deciding whether or not to accept a message on behalf
of an object based on its source, content and destination
(Figure 2). A message filter can be made to
reside within each object, providing ubiquitous access
control in an object-oriented system. This decentralized
authorization technique is superior to and more
natural than centralized access control schemes for distributed
object systems.
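
The idea of a filter residing within each object can be illustrated with a short Python sketch. It is a simplified stand-in, not the message filter of Jajodia et al.: the message content is reduced to the name of the invoked method, and the filter policy is a set of permitted (source, selector) pairs. All class and field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: str
    destination: str
    selector: str   # content: the method being invoked


class FilteredObject:
    """An object that screens every incoming message with its own filter."""

    def __init__(self, name, methods, allowed):
        self.name = name
        self.methods = methods    # selector -> callable
        self.allowed = allowed    # set of permitted (source, selector) pairs

    def message_filter(self, msg):
        """Decide on source, content and destination whether to accept msg."""
        return (msg.destination == self.name
                and msg.selector in self.methods
                and (msg.source, msg.selector) in self.allowed)

    def receive(self, msg):
        if not self.message_filter(msg):
            return "access denied"
        return self.methods[msg.selector]()


cashier = FilteredObject(
    "Cashier",
    {"CheckOut": lambda: "paid", "Transactions": lambda: ["t1", "t2"]},
    {("Customer", "CheckOut")},
)
granted = cashier.receive(Message("Customer", "Cashier", "CheckOut"))
denied = cashier.receive(Message("Customer", "Cashier", "Transactions"))
print(granted, denied)
```

Because the decision is taken inside the receiving object, no central reference monitor has to see the message, which is the decentralization the paragraph above describes.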

The approach to access control of objects described
in this paper relies on a decomposition of object systems
into their most primitive components. Since
message-passing is central to any meta-object model,
the message filter presents itself as an integral piece
of an architecture for unifying access control in heterogeneous
distributed object systems. The following
section explores access control for meta-object models
and presents a flexible authorization architecture that
is easily integrated with existing object models.

3  Meta-Object  Access  Control 

This section describes a primitive operational foundation
for access control in object-oriented systems. It
is designed to be integrated at the meta-object level to
permit a unified treatment of meta-classes, classes and
objects [28]. The main requirement of the meta-level
access control model is that it be flexible enough to
support multiple access control policies in distributed
computing environments.


Figure 3. Meta-object electronic commerce model. [Edge labels: SubclassOf, InstanceOf.]

Meta-object models provide a common theoretical
underpinning for object-oriented systems. They are
capable of presenting a unified view of object-oriented
features such as classes, subclasses and inheritance
with more primitive notions of meta-objects and delegation.
Meta-object models can integrate method invocation,
message-passing and aggregation, the basic
features of all object systems.

Figure 3 illustrates a decomposition of the object-oriented
electronic commerce model into a meta-object
representation. Note that all classes and objects
have been replaced by meta-objects. Meta-objects
that can spawn other meta-objects (using a
method called NewObject()) model classes. Furthermore,
meta-objects capable of spawning “class” meta-objects
are meta-classes. MetaClass and Class are
both meta-classes.

This model uses capabilities [11, 21, 22] to provide
method-based access control for meta-objects. A capability
is an unforgeable token that a subject uses
as a ticket for object access. A ticket, which is also
associated with an access type, can be held by a subject
or for a subject by a trusted third party. The
ticket is inspected by an object or by the trusted third
party before access is granted. Alternatively, capabilities
can be viewed as locks and keys; this view implies
that objects must hold matching tokens. Each subject
has a number of keys that give access to objects with
matching locks. This analogy is illustrated in Figure 4.
SUBJECT1 has key a that matches a read lock
held by OBJECT and, therefore, is given read access
to OBJECT. The intersection of SUBJECT2's keys and
OBJECT's locks is empty. Therefore, SUBJECT2 has no
access to OBJECT.

Figure 4. Access control with capabilities. (SUBJECT1,
with keys {a, c}, presents key a for read access to OBJECT,
whose locks are {a:read, b:write}; SUBJECT2, with keys
{d, e}, requests read access and is denied.)
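
The locks-and-keys reading of Figure 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Subject, Obj and has_access are hypothetical, locks are modeled as (token, access type) pairs, and the key sets are taken from Figure 4 as far as it can be read.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    name: str
    keys: set = field(default_factory=set)      # tokens held as keys


@dataclass
class Obj:
    name: str
    # Each lock pairs a token with the access type it guards.
    locks: set = field(default_factory=set)      # {(token, access type)}


def has_access(subject: Subject, obj: Obj, access: str) -> bool:
    """True if some key of the subject matches a lock of the requested type."""
    return any((key, access) in obj.locks for key in subject.keys)


# Assumed contents of Figure 4.
subject1 = Subject("SUBJECT1", {"a", "c"})
subject2 = Subject("SUBJECT2", {"d", "e"})
target = Obj("OBJECT", {("a", "read"), ("b", "write")})

print(has_access(subject1, target, "read"))   # key a matches the read lock
print(has_access(subject2, target, "read"))   # no key matches any lock
```

Keeping the access type inside the lock, rather than in the key, mirrors the text: a key is just a token, and the object decides what each matching token unlocks.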

Capabilities typically control run-time privilege distribution
between processes and subprocedures. They
can be used to implement various authorization models,
including identity and group-based Discretionary
Access Control (DAC), Role-Based Access Control
(RBAC) and Mandatory Access Control (MAC). This
flexibility makes capabilities ideal for meta-level access
control and conducive to supporting multipolicy
functionality.

In general, an authorization model must resolve
(s, o, a) as true or false for every subject (s), object
(o) and access type (a) given the authorization state.
An authorization state is defined by
State : Object -> Privilege -> Token -> Bool, where Token
serves as a representative for the subject.

Recursive definitions are used to specify allowable
access types in the primitive access control model.
This is necessary to implement a general model of
grant and revoke privileges. The recursive access types
are defined in Figure 5. The type definition permits
access types such as G.G.LOCK (grant grant lock)
and G.R.G.KEY (grant revoke grant key). These
types can be thought of as pertaining to token lists.
Note that every type other than KEY behaves as a lock,
e.g., G.R.KEY is a type of lock that guards the R.KEY
list. Any subject with a KEY token matching a token
associated with G.R.KEY in an object can add
(grant) tokens to the R.KEY list.
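
To make the recursive structure concrete, the following Python sketch represents an access type as a tuple of name parts. The helper names parse, guarded_list and behaves_as_lock are hypothetical, and the ALL privilege is deliberately left out; the sketch covers only the recursive types built from KEY, LOCK, G and R.

```python
BASE_TYPES = {"KEY", "LOCK"}
PREFIXES = {"G", "R"}          # grant and revoke


def parse(text):
    """Turn 'G.R.KEY' (spaces tolerated) into ('G', 'R', 'KEY')."""
    parts = tuple(part.strip() for part in text.split("."))
    if parts[-1] not in BASE_TYPES or any(p not in PREFIXES for p in parts[:-1]):
        raise ValueError(f"not a recursive access type: {text!r}")
    return parts


def guarded_list(priv):
    """For G.p or R.p return p, the token list it guards; None for KEY and LOCK."""
    return priv[1:] if len(priv) > 1 else None


def behaves_as_lock(priv):
    """Every recursive access type other than KEY behaves as a lock."""
    return priv != ("KEY",)


grant_revoke_key = parse("G. R. KEY")
print(guarded_list(grant_revoke_key))        # the R.KEY list
print(behaves_as_lock(parse("G.G.LOCK")))
```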

ACCESS TYPES

priv  ::= ALL
       |  priv2

priv2 ::= KEY
       |  LOCK
       |  G . priv2
       |  R . priv2

ACCESS CONTROL COMMANDS

com ::= ADD -> priv -> token -> object -> object
     |  REMOVE -> priv -> token -> object -> object
     |  com ; com

ACCESS CONTROL PREDICATES

EVAL : com -> state -> state -> bool

TRANS : state -> state -> bool

Figure 5. Access control definitions.

The  ALL  privilege  in  Figure  5  confers  privileges  of 
every  type  to  a  subject.  A  subject  carrying  a  token 
with  the  ALL  privilege  on  some  object  can  add  or  re¬ 
move  any  type  of  authorization.  The  only  limitation 
is  that  the  subject  must  hold  the  actual  token  as  a 
key.  Rule  1  in  Figure  6  formalizes  the  semantics  for 
the  ALL  access  type. 

The  command  set  for  a  user  enables  alteration  of 
the  authorization  state.  This  set  comprises  commands 
for  adding  and  removing  authorization  tuples  to  and 
from  objects.  The  command  set  is  given  in  Figure  5. 
A subject can only add or remove tokens that it holds
as keys. Taking the list-based view, this means that
even when a subject has a grant or revoke permission
on some access control list, the only tokens it is
able to add or remove are those that it holds as keys.
The  third  command  allows  sequencing  of  ADD/REMOVE 
commands. 

Two  predicates,  EVAL  and  TRANS,  are  defined  on  au¬ 
thorization  states  (Figure  5).  EVAL  returns  true  when 
a  command  will  take  one  state  to  another.  It  is  used 
to  define  the  semantics  of  the  command  set.  TRANS 
returns  true  if  a  transition  between  states  is  possible. 
The  relationship  between  EVAL  and  TRANS  is  formal¬ 
ized  by  Rule  2  (Figure  6). 
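Rule 1: $\forall p : priv,\ o : obj,\ t : token,\ s : state.$
$s\ o\ \mathrm{ALL}\ t \Rightarrow s\ o\ p\ t$

Rule 2: $\forall s_1, s_2.\ \exists c : com.$
$\mathrm{EVAL}\ c\ s_1\ s_2 \Rightarrow \mathrm{TRANS}\ s_1\ s_2$

Rule 3: $\forall p, s_1, s_2, o_1, o_2, t.$
$s_1\ o_1\ \mathrm{KEY}\ t \wedge s_1\ o_2\ \mathrm{G}.p\ t \Rightarrow$
$(s_2\ o_2\ p\ t \wedge (\forall o', p', t'.\ o' \neq o_2 \vee p' \neq p \vee t' \neq t \Rightarrow s_1\ o'\ p'\ t' = s_2\ o'\ p'\ t') \Rightarrow$
$\mathrm{EVAL}\ (\mathrm{ADD}\ p\ t\ o_2\ o_1)\ s_1\ s_2)$

Rule 4: $\forall p, s_1, s_2, o_1, o_2, t.$
$s_1\ o_1\ \mathrm{KEY}\ t \wedge s_1\ o_2\ \mathrm{R}.p\ t \Rightarrow$
$(\neg (s_2\ o_2\ p\ t) \wedge (\forall o', p', t'.\ o' \neq o_2 \vee p' \neq p \vee t' \neq t \Rightarrow s_1\ o'\ p'\ t' = s_2\ o'\ p'\ t') \Rightarrow$
$\mathrm{EVAL}\ (\mathrm{REMOVE}\ p\ t\ o_2\ o_1)\ s_1\ s_2)$

Rule 5: $\mathrm{EVAL}\ c_1\ s_1\ s_2 \wedge \mathrm{EVAL}\ c_2\ s_2\ s_3$
$\Rightarrow \mathrm{EVAL}\ (c_1 ; c_2)\ s_1\ s_3$

Figure 6. Access control rules.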

Rule 3 supplies formal semantics to the ADD command.
This rule states that a subject must have
grant privilege over an access type in an object to
add a token of that type to the object. The constraint
that subjects can only add tokens held by them as
keys is formalized. For example, authorization tuples
< o, G.R.LOCK, a >, < s, KEY, a > and < s, KEY, b >
allow the command ADD R.LOCK b o s.

The semantics for the REMOVE command (Rule 4)
specify when it is legal for a subject to remove authorization
tuples. Using the previous example, an additional
authorization tuple < o, R.R.LOCK, b > would let
s remove R.LOCK permissions from o. Again, s must
hold the affected token as a key.

Rule  5  introduces  command  sequences  to  the  sys¬ 
tem.  It  formalizes  the  transitive  nature  of  commands 
on  authorization  states. 
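
The command semantics can be exercised with a small executable sketch. It is an interpretation, not the formal rules: it follows the prose and the worked example (one key of the subject must match the grant or revoke list, and the affected token must itself be held as a key), and it computes the successor state directly instead of testing a pair of states as EVAL does. The names holds, may_change, step and run are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

Auth = Tuple[str, str, str]                # (object, access type, token)
State = FrozenSet[Auth]
Command = Tuple[str, str, str, str, str]   # (op, priv, token, target, subject)


def holds(state: State, obj: str, priv: str, token: str) -> bool:
    # Rule 1: an ALL entry for a token stands for every access type.
    return (obj, priv, token) in state or (obj, "ALL", token) in state


def may_change(state: State, prefix: str, priv: str, token: str,
               target: str, subject: str) -> bool:
    # The subject must hold the affected token as a key ...
    if not holds(state, subject, "KEY", token):
        return False
    # ... and one of its keys must match the G.priv or R.priv list of the target.
    guard = f"{prefix}.{priv}"
    tokens = {t for (_, _, t) in state}
    return any(holds(state, subject, "KEY", k) and holds(state, target, guard, k)
               for k in tokens)


def step(state: State, command: Command) -> State:
    op, priv, token, target, subject = command
    prefix = {"ADD": "G", "REMOVE": "R"}[op]
    if not may_change(state, prefix, priv, token, target, subject):
        raise PermissionError(f"{subject} may not {op} {priv} {token} on {target}")
    entry = (target, priv, token)
    return state | {entry} if op == "ADD" else state - {entry}


def run(state: State, commands: Iterable[Command]) -> State:
    # Rule 5: a sequence is evaluated left to right through intermediate states.
    for command in commands:
        state = step(state, command)
    return state


# The tuples from the example in the text, plus the revoke tuple.
start: State = frozenset({
    ("o", "G.R.LOCK", "a"),
    ("s", "KEY", "a"),
    ("s", "KEY", "b"),
    ("o", "R.R.LOCK", "b"),
})
after_add = step(start, ("ADD", "R.LOCK", "b", "o", "s"))
after_both = run(start, [("ADD", "R.LOCK", "b", "o", "s"),
                         ("REMOVE", "R.LOCK", "b", "o", "s")])
```

Using an immutable frozenset for the state keeps each transition explicit, which matches the state-to-state view taken by EVAL and TRANS.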

4  Authorization  in  the  Meta-Object 
Model 

The Meta-Object Model (MOM) in MOOSE is a
core model for the design and analysis of sophisticated
distributed object systems. MOM is augmented with
primitive mechanisms intended to support a spectrum
of authorization models for object-oriented systems.
MOM will permit developers to construct and analyze
secure object-oriented programming languages
and database management systems. This section describes
MOM and the integration of capability-based
access control primitives in MOM.

Figure 7. MOM object components.

MOM is defined with the Robust Object Calculus
(ROC), a process calculus tailored to modeling
distributed objects [30]. The underlying principles of
ROC, e.g., encapsulation and tuple-based communication,
have facilitated the formal design of MOM.
The  design  of  MOM  is  influenced  by  ACTORS  [1]  and 
Concurrent  Aggregates  [7].  MOM  supports  core  ob¬ 
ject  functionality,  including  persistence,  method  invo¬ 
cation,  asynchronous  message-passing,  delegation  and 
aggregation.  Virtually  any  object  model  can  be  con¬ 
structed  from  this  core  functionality. 

Access  control  policies  are  modeled  in  MOM  sys¬ 
tems  with  object  access  control  lists  (OACLs)  and 
message  filters.  These  capability-based  mechanisms 
implement  flexible  and  ubiquitous  access  control  for 
objects  and  methods.  Objects  with  OACLs  and  mes¬ 
sage  filters  can  provide  authorization  services  to  do¬ 
mains  of  subobjects  when  efficiency  concerns  outweigh 
security  requirements. 

4.1  MOM  Objects 

MOM  objects  are  viewed  as  a  collection  of  tightly 
encapsulated  components  (processes).  Each  MOM  ob¬ 
ject  has  a  set  of  identifiers  that  defines  how  it  can  be 
addressed.  MOM  components  that  share  an  identifier 
are  said  to  be  part  of  the  same  object.  Identifiers  in 
MOM are navigational tuples of names. Navigational
identifiers (nids) are defined using the following BNF
rules.

nid ::= [olid#, nid#] | nil

olid ::= [direction, lid#]

direction ::= in | out

A MOM system entails a hierarchical structure of
object  domains,  specifying  objects  that  contain  ob¬ 
jects.  Every  MOM  system  contains  exactly  one  root 
object,  which  is  not  contained  by  any  other  object. 
The  root  object  is  used  in  a  bootstrapping  process 
that  initializes  MOM  systems.  Each  object  is  named 
by  a  local  identifier  (lid)  unique  to  its  domain.  The 
domain  of  an  object  is  its  parent  object. 

The local identifier of an object (say obj1) is
prepended to the global identifier (gid) of its parent
(say root) to define a unique gid for the object, e.g.,

[ [out, obj1# ]#, [ [out, root# ]#, nil# ]# ]. Objects
consist of a number of cooperating MOM components:
an object registry, a message handler and
methods.  The  message  handler  and  the  object  reg¬ 
istry  form  the  basis  for  an  object’s  identity,  controlling 
communication  for  a  tightly  bound  set  of  components. 
Furthermore,  objects  can  house  a  message  filter  and 
an  object  access  control  list  (OACL),  which  contains 
authorizations  for  access  to  the  object’s  components. 
The  message  filter  resides  in  the  message  handler  and 
authorizes  each  message  using  the  OACL.  MOM  ob¬ 
ject  components,  including  the  OACL  and  message 
filter,  are  illustrated  in  Figure  7. 
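
As a rough illustration of how global identifiers nest, the sketch below builds a gid by prepending a local identifier to the parent's gid and renders it in the bracketed notation used above. The functions make_gid and render are hypothetical helpers, nil is modeled as None, and obj1 is used only as a sample object name.

```python
def make_gid(lid, parent_gid=None, direction="out"):
    """Prepend the pair (direction, lid) to the parent's gid (None stands for nil)."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return ((direction, lid), parent_gid)


def render(nid):
    """Print a nid in the bracketed tuple notation."""
    if nid is None:
        return "nil"
    (direction, lid), rest = nid
    return f"[ [{direction}, {lid}# ]#, {render(rest)}# ]"


root_gid = make_gid("root")
obj1_gid = make_gid("obj1", root_gid)
print(render(obj1_gid))
```

Because a gid is a linked list from the object up to the root, the identifier also spells out the chain of enclosing domains.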

An  object  registry  maintains  a  record  for  each 
active  component  within  the  object.  In  particular, 
an  object  registry  keeps  track  of  message  handlers, 
method  interfaces  and  each  active  method  invocation. 
The  lid,  the  component  type  and  a  third  element  con¬ 
taining  miscellaneous  information  are  stored  within 
each  registry  record.  As  an  object  is  being  deleted 
it  must  refer  to  its  registry  so  that  it  can  gracefully 
delete  each  of  its  components.  Furthermore,  to  create 
an  object  inside  a  parent,  the  registry  is  checked  to 
see  that  the  new  lid  is  unique.  Only  then  is  the  object 
created  and  registered  with  the  parent  object. 
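
A minimal sketch of the registry behavior described above, assuming a dictionary keyed by lid; the class Registry and its method names are hypothetical and stand in for whatever process structure MOM actually uses.

```python
class Registry:
    """Keeps one record (component type, misc) per active component, keyed by lid."""

    def __init__(self):
        self._records = {}

    def register(self, lid, component_type, misc=None):
        # A new component is only registered if its lid is unique in this object.
        if lid in self._records:
            raise KeyError(f"lid {lid!r} is already registered")
        self._records[lid] = (component_type, misc)

    def delete_all(self):
        """Walk the registry so every component is removed; return the lids removed."""
        removed = []
        while self._records:
            lid, _record = self._records.popitem()
            removed.append(lid)
        return removed


registry = Registry()
registry.register("handler", "message handler")
registry.register("ReadPrice", "method interface")
```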

4.2 MOM Message-Passing

Messages  embody  asynchronous  communication  in 
MOM.  Messages  carry  method  invocation  requests, 
acknowledgements  and  replies  between  objects.  Once 
a  request  is  sent,  the  sender  can  continue  its  compu¬ 
tation  without  waiting  for  an  acknowledgement  or  re¬ 
ply,  i.e.,  all  communication  is  asynchronous.  MOM 
messages  are  categorized  as  method  invocation  re¬ 
quests  (request),  subsequent  replies  (reply)  or  ac¬ 
knowledgements  (ack). 


A  message  handler  processes  incoming  MOM  mes¬ 
sages  and  marshalls  object  requests.  It  controls  the 
distribution  of  requests  and  replies  for  a  tightly  bound 
set  of  components.  Figure  8  illustrates  the  creation 
and  acceptance  of  messages  by  message  handlers  for  a 
request/reply  sequence. 

An incoming message can be received as a local
method invocation request or delegated to another
object. A message handler “delegates” a message
by consuming it and re-creating it in an adjacent
domain. For example, the root domain's message
handler M_Hndlr_root consumes the reply message
and creates a new message in A's domain.

4.3  MOM  Methods 

The method architecture in MOM is composed of
three types of communicating processes: method interfaces,
method arbiters and method bodies. Method interface
components accept invocation requests propagated
by message handlers. These interfaces control
access and synchronization for individual methods. A
unique method interface exists for each method in an
object.

Upon  receiving  an  invocation  request,  a  method  in¬ 
terface  will  spawn  a  method  arbiter  and  method  body. 
The  method  body  process  performs  the  actual  compu¬ 
tation,  while  the  method  arbiter  negotiates  commu¬ 
nications  between  the  method  body  and  the  outside 
world.  The  MOM  method  subsystem  is  shown  in  Fig¬ 
ure  9. 
Figure 9. MOM method scheme.

The  MOM  toolkit  has  mutable  and  immutable 
methods.  A  mutable  method  invocation  spawns  a  per¬ 
sistent  process  with  state  that  can  be  accessed  many 
times.  Immutable  methods,  on  the  other  hand,  be¬ 
have  like  traditional  methods,  terminating  and  return¬ 
ing  a  value  upon  completion.  While  mutable  methods 
model  instance  variables,  it  is  a  MOM  standard  to 
create  (immutable)  accessor  methods  in  the  object  for 
instance  variables. 

Requests to an immutable method spawn a new
method arbiter and body to support concurrent
method invocation. This may or may not be allowed
by the method interface, which controls synchronization.
A mutable method interface does not permit
co-existing invocations: only one body and arbiter
can be active at a time. Requests to an active mutable
method are forwarded to its method arbiter,
which then forwards them to the method body.
The behavior of the method body can be affected by
previous requests; this models state information.

Immutable  methods  behave  like  methods  in  conven¬ 
tional  object-oriented  programming  languages.  Each 
request  spawns  a  new  method  body  supporting  con¬ 
current  method  invocation.  The  completion  of  an  in¬ 
vocation  results  in  a  reply  from  the  method  to  the 
method  arbiter.  This  reply  can  be  propagated  back  to 
the  initiating  object  (modeling  a  traditional  method 
invocation)  or  it  can  go  elsewhere  as  indicated  by  the 
request. 

Method  interfaces  for  immutable  methods  accept 
invocation  requests  and  create  new  method  arbiters 


and  bodies  for  each  request.  Method  arbiters  manage 
individual  method  bodies.  For  immutable  methods 
this  entails  passing  parameters  to  a  method  body  and 
waiting  for  a  reply.  Method  bodies  are  also  defined 
with the restriction that they be properly encapsulated
to avoid interference with other MOM components.
A method body, once created, must receive a
request  sent  by  its  parent  arbiter.  A  method  body 
can  in  turn  invoke  other  methods,  including  those  in 
foreign  objects. 

An  immutable  method  must  formulate  an  appro¬ 
priate  reply  to  complete  an  invocation.  A  parame¬ 
ter  passed  to  the  method  body  determines  the  correct 
type  of  reply;  this  can  be  noreply  if  no  reply  is  nec¬ 
essary,  nil  to  receive  a  normal  reply,  or  it  could  be  the 
nid  of  another  object. 

Component  creation  and  deletion  is  handled  by  spe¬ 
cial  methods,  NewMethod,  NewObject,  DeleteMethod, 
and  DeleteObject,  that  are  resident  in  each  object. 
Objects  invoke  these  methods  the  way  they  would  any 
other  method.  For  example,  a  foreign  object  might  is¬ 
sue  a  request  to  another  object  to  create  a  subobject. 
However,  methods  might  refuse  this  request  if  a  con¬ 
flict  exists  in  the  object  registry  (e.g.,  a  subobject  of 
the  same  name  exists)  or  even  as  a  matter  of  policy 
(e.g., only ancestors can delete object components).

4.4 MOM Security

OACLs  and  message  filters  implement  access  con¬ 
trol  in  MOM  systems.  Each  meta-object  can  con¬ 
tain  an  OACL  and  a  message  filter.  Each  OACL 
contains  a  list  of  authorization  tuples  of  the  form 
<  component,  accesstype, token  >.  Message  filters 
use these lists to authorize messages received by message
handlers. A message contains source, destination
and content information. The content information
specifies the type of access, e.g., G.LOCK, as well
as a set of tokens provided by the message originator
to be used as keys in the authorization process.

Method-based access control is specified by authorization
tuples that contain method names in the component
field. Authorization commands, e.g., G.KEY,
can also be issued in messages to destination objects.
However,  filtration  can  occur  at  the  destination  and  at 
all  objects  along  the  message  route.  Therefore,  dele¬ 
gating  a  message  between  objects  must  be  authorized. 
This  feature  protects  entire  objects. 

A message filter examines a message and compares
it with its OACL to determine authorization. If the
message source holds a key matching a lock held by
the intended component recipient, the message is authorized.
This simple scheme functions in the realm
of object-oriented programming languages and is particularly
well-suited to object-oriented databases and
distributed object systems. Figure 10 shows the components
of a MOM object with the OACL and message
filter.

Figure 10. OACL and message filter components.

MOM objects affect the access control scheme by
introducing intervening filters. Assume that object s
has legitimate access to object o by virtue of s having
a key matching one of o's locks. Object s could be
denied access to o if an intervening object i exists on
the message path between s and o. This can occur
because messages must be authorized to pass through
each domain between source and destination. If s is
not allowed to send messages to i, the message will
be returned at that point and access to o will be
effectively denied. Intervening objects complicate the
authorization architecture but facilitate specialization
of authorizations.
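
The effect of intervening filters can be sketched as a walk along the message route. This is an illustration under simplifying assumptions: Message, Domain and deliver are hypothetical names, every domain on the route checks the same (component, access type, token) tuple against its own OACL, and a domain with filtered set to False stands for an object that leaves authorization to an enclosing domain.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    source: str
    destination: str
    component: str        # e.g. a method name
    access: str           # requested access type, e.g. "LOCK"
    keys: frozenset       # tokens offered by the originator


@dataclass
class Domain:
    name: str
    oacl: set = field(default_factory=set)   # {(component, access type, token)}
    filtered: bool = True                     # False: no message filter installed

    def admits(self, msg: Message) -> bool:
        if not self.filtered:
            return True
        return any((msg.component, msg.access, key) in self.oacl for key in msg.keys)


def deliver(msg: Message, route):
    """Return the name of the first domain that rejects msg, or None if it arrives."""
    for domain in route:
        if not domain.admits(msg):
            return domain.name
    return None


msg = Message("s", "o", "ReadPrice", "LOCK", frozenset({"read_price"}))
o = Domain("o", {("ReadPrice", "LOCK", "read_price")})
i = Domain("i")                     # intervening object with an empty OACL

direct = deliver(msg, [o])          # s has a key matching one of o's locks
blocked = deliver(msg, [i, o])      # the filter in i returns the message
```

Returning the name of the rejecting domain makes it visible that access to o is denied at i, before the message ever reaches o's own filter.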

The root object contains all other objects and plays
an important role in the bootstrapping process by creating
meta-objects and initializing the authorization
state. Once meta-objects are created they in turn
spawn objects and propagate authorizations when necessary.
Classes are meta-objects with special methods
that construct instance objects of the same type.
Authorizations can be inherited by instances or subclasses
via token propagation and runtime delegation.
These techniques manifest implicit authorization flow
in object-oriented systems. User-controlled objects
manifest explicit authorization at runtime by invoking
methods containing authorization commands.

Figure 11. MOM electronic commerce.

Consider  the  MOM  representation  of  the  electronic 
commerce  model  illustrated  in  Figure  11.  It  is  comple¬ 
mented  by  a  partial  view  of  the  virtual  global  OACL  in 
Figure  12.  The  global  OACL  shows  the  authorization 
state  after  the  bootstrapping  process. 

No  customer  should  be  able  to  access  the  informa¬ 
tion  of  another  customer.  This  constraint  is  enforced 
by  ensuring  that  no  instance  of  Customer  has  a  key 
that  matches  a  lock  in  the  OACL  of  the  root  ob¬ 
ject  (which  contains  all  customer  instances).  Figure 
11  shows  the  result  of  John  trying  to  access  Jody. 
Note  that  access  to  Jody  and  its  subobjects  is  de¬ 
nied  because  the  filter  in  root  intervenes.  The  filter 
checks  the  OACL  in  root  and  discovers  that  John  does 
not  have  permission  to  send  messages  to  (or  through) 
Jody. 

Another reasonable policy is that customers
should be able to read, but not modify, stock
item prices. Therefore, the authorization tuple
< ReadPrice(), LOCK, read_price > should be contained
in the OACL for the meta-object (class)
Stock Item. Customer access to prices through
Customer::ReadPrice() implies that the OACL for
meta-object (class) Customer contains the authorization
tuple < ReadPrice(), KEY, read_price >. The
authorization commands to add these tuples are given
by the root object in the bootstrapping process.

Delegations must be authorized in the same way
that method invocations are authorized. Meta-objects
are responsible for propagating delegation authorizations
to instances. Stock Item must propagate
< ReadPrice(), LOCK, read_price > to each
of its instances and/or subclasses. This propagation
is performed in the constructor
Stock Item::NewObject().

When an instance of Customer reads a price, it
delegates up to Customer::ReadPrice() which contains
the correct key for authorization to stock item
prices. In particular, Figure 11 shows the method
invocation chain when Mark::GetPrice() is invoked for
I2. At the other end of this method invocation request,
Customer::ReadPrice() invokes an accessor
function local to I2, which is then delegated to an
accessor method in Stock Item.

Authorizations for method-based access can be specialized
in the same manner that methods are specialized.
The authorization to delegate is held by
the delegating object (the instance or subclass), facilitating
authorization specialization. For example,
the accessor method in I1 could choose not to delegate
to its parent class, but instead define its own
behavior and/or authorization set. This is illustrated
in Figure 11 where Chris attempts to get the price
of I1, but is denied because the authorization tuple
< ReadPrice(), LOCK, read_price > for delegation to
the parent class is missing (Figure 12).

Efficiency is a primary concern when message filters
are used to implement access control. The proposed
architecture permits objects to provide authorization
services for entire domains. This obviates the
use of message filters and OACLs for each and every
object in MOM, resulting in a flexible and potentially
lightweight access control system.


5  Related  Work 

The  approach  taken  by  this  work  is  motivated 
by  the  object  systems  LOOPS  [28]  and  ACTORS 
[1].  Both  systems  probed  the  foundations  of  the 
object  paradigm  and  developed  sophisticated  meta¬ 
object  models.  The  meta-level  access  control  mecha¬ 
nisms  embedded  in  the  Meta-Object  Model  (MOM) 
described  in  this  paper  support  multipolicy  access 
control  in  heterogeneous  object  systems.  The  addi¬ 
tional  consideration  of  decentralized  authorization  is 
geared  toward  distributed  objects.  The  meta-level  ac¬ 
cess  control  architecture  relies  heavily  on  three  con¬ 
cepts:  capabilities,  message  filters  and  method-based 
access  control.  Integrating  these  features  in  a  meta¬ 
object  model  results  in  a  rich  framework  for  expressing 
a  variety  of  authorization  models  for  object-oriented 
systems. 

Research  efforts  in  object-oriented  systems  and 
in  database  security  have  also  influenced  our  work 
[3,  8,  9,  10,  12,  17,  29,  31].  The  ORION/ITASCA  sys¬ 
tem  adopts  a  model  of  discretionary  access  control  for 
objects  that  embraces  notions  of  explicit/implicit,  pos¬ 
itive/negative  and  weak/strong  authorizations  [24]. 
The  model  is  based  on  four  fundamental  access  types 
and  also  incorporates  roles.  An  extension  to  this 
model  described  in  [5]  supports  additional  access  types 
and  the  modeling  of  type  dependencies.  The  extended 
model  also  clarifies  the  semantics  of  subject  groups 
and  considers  object  versions  and  the  potential  for 
distributed  authorization  control.  Type  definitions  in 
our  model  are  easily  extended  to  positive/negative  and 
weak/strong  authorizations;  this  would  require  mod¬ 
ifying  components  to  handle  the  new  types.  An  im¬ 
portant  advantage  of  our  model  is  that  semantic-based 
forms  of  implicit  authorization  emerge  in  any  object- 
oriented  system  designed  using  MOM. 

Fernandez  introduced  method-based  access  control 
for  object-oriented  systems  [15].  By  using  methods 
as  a  basis  for  access  control,  first-order  access  types 
(e.g.,  read  and  write)  are  reduced  to  a  single  execute 
type.  We  have  chosen  method-based  access  control  be¬ 
cause  it  complements  MOM  nicely.  MOM  stipulates 
that  access  always  occurs  through  a  method  invoca¬ 
tion;  for  example,  reads  and  writes  are  implemented 
using  specialized  accessor  methods.  This  facilitates 
policy  specifications  that  are  emergent  from  meta-level 
access  control  primitives  and  MOM. 

Multipolicy  access  control  is  becoming  an  impor¬ 
tant  area  of  research  [2,  4,  18].  The  mechanisms  and 
models  provided  by  multipolicy  systems  enable  users 
to  protect  each  object  according  to  a  different  pol¬ 
icy.  The  architecture  described  in  [4]  employs  flexible 
access control mechanisms and mediators [32]. Mediators
are used to shape the access control mechanism so
that  it  can  enforce  user-specified  access  control  poli¬ 
cies.  Our  work  in  the  area  of  multipolicy  systems  seeks 
a  common  ground  for  access  control  mechanisms  that 
can  support  the  interoperation  of  disparate  authoriza¬ 
tion  policies. 

The  Argos  access  control  system  [20]  also  shares 
a  goal  with  the  work  presented  here  -  that  of  de¬ 
veloping  a  unified  view  of  heterogeneous  access  con¬ 
trol  models  in  open  distributed  environments.  Argos 
achieves  this  goal  by  incorporating  features  of  vari¬ 
ous  identity-based  authorization  models.  Specifically, 
it  models  implicit  authorization  flow  and  introduces 
domains  that  are  used  to  generate  classes  of  behavior 
and/or  protection  rights.  These  features  exploit  the 
semantic  richness  of  the  object-oriented  paradigm  to 
create  a  flexible  authorization  model.  Our  approach 
is  different  in  that  it  decomposes  object  behavior  into 
primitive  mechanisms.  Using  capabilities  is  also  more 
general  -  a  capability  can  be  an  identity,  a  group,  a  la¬ 
bel,  a  role,  or  an  unforgeable  ticket  in  an  anonymous 
transaction. 

The  Distributed  Computing  Environment  (DCE) 
requires  decentralized  authorization  services  to  cope 
with  the  difficulties  inherent  in  open  distributed  envi¬ 
ronments.  DCE’s  authorization  service  associates  ac¬ 
cess  control  lists  with  servers,  files  and  records,  spec¬ 
ifying  legal  operations  for  each  user  [14,  25].  Princi¬ 
pals  (subjects)  are  registered  in  a  database  and  as- 
signed  group  and  organization  membership.  A  mem¬ 
ber’s  name,  group  and  organization  information  define 
the  member’s  privilege  attributes.  The  authorization 
service  works  in  concert  with  DCE’s  authentication 
service.  A  member’s  privilege  attributes  are  embed¬ 
ded  in  a  ticket  provided  by  the  authentication  server 
at  login.  An  ACL  manager  resides  on  each  file  server 
authorizing  access  requests.  DCE  supports  various 
kinds  of  authorization  models  by  allowing  customiza¬ 
tion  of  ACL  managers.  However,  customization  is 
solely  the  responsibility  of  the  application  developer. 
While  DCE  does  not  deal  directly  with  object-oriented 
and  multipolicy  access  control  issues,  it  is  an  attempt 
to  provide  ubiquitous,  yet  practical,  access  control  in 
distributed  systems.  MOM  must  achieve  this  goal  if 
it  is  to  provide  a  common  secure  substrate  for  hetero¬ 
geneous  distributed  objects. 

6  Conclusions 

The meta-level authorization service architecture
presented in this paper integrates primitive capability-based
access control mechanisms within a meta-object
model for maximal support of multiple policies in
heterogeneous object systems. The meta-object model
engages  message  filters  and  method-based  access  con¬ 
trol,  although  access  control  for  objects  is  also  possi¬ 
ble.  Access  control  in  this  model  can  be  ubiquitous, 
where  each  object  is  responsible  for  its  own  autho¬ 
rization  policy.  It  can  also  be  lightweight,  where  ob¬ 
jects  provide  authorization  services  for  entire  object 
domains.  Implemented  in  the  Meta-Object  Model 
(MOM),  this  authorization  service  architecture  pro¬ 
vides  a  common  foundation  for  the  secure  interopera¬ 
tion  of  heterogeneous  distributed  objects. 
Abstract

Workflow management systems (WFMS) support the
modeling and coordinated execution of processes within
an organization. To coordinate the execution of the various
activities (or tasks) in a workflow, task dependencies
are specified among them. As advances in workflow
management take place, they are also required to support
security. In a multilevel secure (MLS) workflow,
tasks may belong to different security levels. Ensuring
the task dependencies from the tasks at higher security
levels to those at lower security levels (high-to-low
dependencies) may compromise security. In this paper, we
consider such MLS workflows and show how they can
be executed in a secure and correct manner. Our approach
is based on semantic classification of the task
dependencies that examines the source of the task
dependencies. We classify the high-to-low dependencies in
several ways: conflicting vs conflict-free, result-independent
vs result-dependent, strong vs weak, and abortive vs
non-abortive. We propose algorithms to automatically
redesign the workflow and demonstrate that only a small
subset among all the types of high-to-low dependencies
needs to be executed by trusted subjects and all other
types can be executed without compromising security.

The  solutions  proposed  in  this  paper  are  directly  ap¬ 
plicable  to  another  relevant  area  of  research  —  execution 
of  multilevel  transactions  in  multilevel  secure  databases 
since the atomicity requirements and other semantic requirements
can be modeled as a workflow. When compared
to the research in this area, our work (1) is more
general  in  the  sense  that  it  can  model  several  other  types 
of  dependencies  thereby  allowing  one  to  specify  relaxed 
atomicity  requirements  and  (2)  is  capable  of  automati¬ 
cally  redesigning  a  workflow  without  requiring  any  hu¬ 
man  intervention  by  eliminating  some  cycles  among  task 
dependencies  that  helps  to  attain  higher  degree  of  atom¬ 
icity. 

1  Introduction 

Workflow  management  is  a  new  and  emerging  area 
of  research.  Workflow  management  systems  (WFMS) 
support the modeling and coordinated execution of processes
within an organization. WFMS represent today
an  important,  inter-disciplinary  area  which  is  commer¬ 
cially  very  significant,  as  witnessed  by  the  large  number 
of  available  products  and  by  the  standardization  effort 
undertaken by the Workflow Management Coalition organization.
The reason why WFMS are becoming increasingly
important is that, from an enterprise point
of view, the effective management of business processes
is becoming increasingly crucial. Business processes control
which piece of work will be performed by whom
and which resources are required and used to accomplish
this task. Therefore, a business process specifies
how a certain organization will achieve its goals.
Optimizing  such  a  process  is  crucial  in  today’s  compet¬ 
itive  world.  Very  often,  the  use  of  WFMS  is  connected 
with  business  process  reengineering  by  which  business 
processes  are  redesigned  to  achieve  significant  improve¬ 
ments  in  critical  factors  such  as  cost,  quality,  service 
and  speed.  Several  applications  are  already  supported 
by  WFMS,  including  insurance  policy/claims  process¬ 
ing,  travel  expense  approvals,  healthcare  management, 
system  monitoring  and  exception  handling,  just  to  name 
a  few. 

To  coordinate  the  execution  of  the  various  activities 
(or  tasks)  in  a  workflow,  a  set  of  constraints  called  the 
task  dependencies  are  specified  among  them.  Task  de¬ 
pendencies  represent  a  key  component  in  ensuring  the 
flexibility  required  to  support  exceptions,  alternatives, 
compensations  and  so  on,  which  all  arise  in  real-life  ac¬ 
tivities.  An  example  of  constraint  is  to  specify  that  a 
certain  task  must  be  aborted  if  another  task  is  aborted. 
Such  a  constraint  models  the  fact  that  if  the  latter  task 
is  not  successfully  completed,  the  former  task  is  useless 
and  therefore  must  be  aborted.  The  development  of  flex¬ 
ible  and  powerful  WFMS  entails  many  important  issues. 
These  systems  are  thus  continuously  evolving  in  order 
to  better  satisfy  application  requirements.  In  particular, 
as  advances  in  WFMS  take  place  and  their  application 
scope  widens,  they  are  also  required  to  support  security, 
meaning  that  coordination  among  processes  at  different 
security  levels  has  to  be  supported,  indeed  without  vio¬ 
lating  security. 
In order to ensure correctness and reliability, workflows
are associated with a workflow transaction
model [9]. It is important to note that workflow transaction
models must somehow be based on more “flexible”
correctness  criteria  than  traditional  transaction  models. 
For  example,  the  classical  “all-or-nothing”  property  of 
the  traditional  transaction  models  is  not  appropriate  for 
workflow  transactions.  Such  a  workflow  transaction  may 
need  to  commit  some  of  its  actions,  while  aborting  other 
actions.  To  satisfy  such  flexibility  requirements,  a  large 
number  of  workflow  transaction  models  have  been  pro¬ 
posed.  Despite  the  flurry  of  research  and  development 
work  around  workflow  transaction  models,  security  for 
such  transaction  models  has  not  been  addressed  yet.  In 
a  multilevel  secure  workflow  transaction  (MLS  workflow 
in  short),  tasks  may  belong  to  different  security  levels. 
Thus  ensuring  all  the  task  dependencies,  especially  those 
from  a  task  at  a  higher  security  level  to  that  at  a  lower 
security  level,  may  compromise  security.  It  is  easy  to  un¬ 
derstand  that  in  a  multilevel  environment  it  is  not  possi¬ 
ble  to  force  the  abort  of  a  lower  level  task  upon  the  abort 
of  a  higher  level  task.  The  goal  of  the  work  we  present 
here is to consider MLS workflows and show how they
can  be  executed  in  a  secure  and  correct  manner. 

Our  approach  begins  with  a  semantic  classification  of 
the task dependencies which is based on a close examination
of the source of the task dependencies. We argue
that  only  certain  types  of  dependencies  can  occur  in  MLS 
workflows.  Then  we  propose  algorithms  to  automatically 
(without  human  intervention)  redesign  the  workflow  in 
such a way that it can be executed in a secure and correct
manner. In particular, our approach focuses on redesigning
dependencies from higher level tasks to those
at  lower  level  because  they  are  the  cause  for  a  potential 
covert  channel. 

The  remainder  of  the  paper  is  organized  as  follows. 
Section  2  reviews  the  workflow  and  security  models  and 
develops  the  necessary  definitions  to  formalize  our  ap¬ 
proach.  Section  3  presents  the  multilevel  secure  work- 
flow  model.  Section  4  provides  an  approach  to  identify 
the  various  types  of  task  dependencies  based  on  the  se¬ 
mantics  of  the  dependencies.  Section  5  shows  how  all  the 
types  of  dependencies  can  be  enforced  without  compro¬ 
mising  security.  Finally  section  6  provides  conclusions. 
Proofs  of  the  theorems  are  presented  in  the  appendix. 

2  The  Model 

In this section we introduce the basic elements of our
workflow model and we summarize the security model
we assume.

2.1  The  Workflow  Model 

A workflow is a set of tasks with task dependencies defined
among them. A task in its simplest form consists
of a set of data operations and task primitives {begin,
abort, commit}. Execution of a task, in addition to invoking
operations on data items (either read or write),
requires invocation of these task primitives. All data
operations in a task must be executed only after the begin
primitive is issued. All tasks must end with either a
commit or abort.

A primitive may move a task from one state to another.
A task $t_i$ can be in one of the following states:
initial state ($in_i$), execution state ($ex_i$), commit state
($cm_i$) or abort state ($ab_i$). (We use $b_i$, $a_i$ and $c_i$ to denote
the begin, abort and commit primitives of $t_i$.) For
instance, a task may move from its initial state to the
execution state by invoking the begin primitive.
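
The state transitions described above can be sketched as a small state machine. This is only an illustrative sketch: the class and function names are hypothetical, and it assumes that commit and abort are issued from the execution state, which the text does not state explicitly.

```python
from enum import Enum


class State(Enum):
    INITIAL = "in"
    EXECUTING = "ex"
    COMMITTED = "cm"
    ABORTED = "ab"


# Assumed transition table: primitive -> (required state, resulting state).
TRANSITIONS = {
    "begin": (State.INITIAL, State.EXECUTING),
    "commit": (State.EXECUTING, State.COMMITTED),
    "abort": (State.EXECUTING, State.ABORTED),
}


class Task:
    """A task that starts in its initial state and moves via primitives."""

    def __init__(self, name):
        self.name = name
        self.state = State.INITIAL

    def apply(self, primitive):
        required, resulting = TRANSITIONS[primitive]
        if self.state is not required:
            raise ValueError(
                f"{primitive} is not allowed for {self.name} in state {self.state.value}"
            )
        self.state = resulting
        return self.state
```

A task created with `Task("t1")` moves to the execution state on `apply("begin")` and to a final state on `apply("commit")` or `apply("abort")`; any further primitive is rejected.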

To  control  the  coordination  among  different  tasks,  de¬ 
pendencies  are  specified  based  on  these  task  primitives. 
Task dependencies in turn can either be static or dynamic
in nature. In the static case, the workflow is defined
well in advance of its actual execution, whereas dynamic
dependencies develop as the workflow progresses
through its execution [14]. Task dependencies may exist
among  tasks  within  a  workflow  (intra-workflow)  or  be¬ 
tween  two  different  workflows  (inter-workflow).  In  [13], 
three  basic  types  of  task  dependencies  have  been  identi¬ 
fied:  control-flow  dependencies,  value  dependencies  and 
external  dependencies.  Control-flow  dependencies  may 
in  turn  involve  explicit  transmission  of  data  as  part  of 
the  result  of  a  task.  We  call  such  dependencies  control- 
flow  dependencies  with  data  flow.  In  the  remainder  of 
this  section,  we  briefly  discuss  the  four  types  of  depen¬ 
dencies  and  introduce  some  basic  definitions  concerning 
workflows. 

2.1.1  Control-flow  Dependencies 

A  control-flow  dependency  can  be  defined  as  follows: 

A task $t_j$ can enter state $st_j$ only after task $t_i$ enters state
$st_i$. Control-flow dependencies can be modeled based
on the ACTA framework [6].

Given two extended transactions $t_i$ and $t_j$, a list of
possible control-flow dependencies is presented below.

1. Strong Commit Dependency: If $t_i$ commits
then $t_j$ must commit (represented as $t_i \xrightarrow{sc} t_j$).

2. Abort Dependency: A task $t_j$ must abort if $t_i$ aborts
(represented as $t_i \xrightarrow{a} t_j$).

3. Termination Dependency: A task $t_j$ can terminate
(either commit or abort) only after the completion
(commit or abort) of $t_i$ (represented as $t_i \xrightarrow{t} t_j$).

4. Begin Dependency: A task $t_j$ cannot begin until $t_i$
has begun (represented as $t_i \xrightarrow{b} t_j$).

5. Begin-on-Commit Dependency: A task $t_j$ cannot
begin until $t_i$ commits (represented as $t_i \xrightarrow{bc} t_j$).

6. Force Begin-on-abort Dependency: A task $t_j$ must
begin if $t_i$ aborts (represented as $t_i \xrightarrow{fba} t_j$).

7. Exclusion: Given any two tasks $t_i$ and $t_j$, if $t_i$ commits
$t_j$ must abort, or vice versa (represented as
$t_i \xrightarrow{e} t_j$).

8. Weak begin-on-commit: Given any two tasks $t_i$ and
$t_j$, $t_j$ can begin if $t_i$ commits (represented as
$t_i \xrightarrow{wbc} t_j$).

9. Group Commit: Given any two tasks $t_i$ and $t_j$, either
both $t_i$ and $t_j$ commit or neither commits [4]*
(represented as $t_i \xrightarrow{gc} t_j$ or $t_j \xrightarrow{gc} t_i$).

A comprehensive list of task dependencies based on
the three task primitives, namely, begin, commit and
abort, can be found in [8, 6], which include commit,
weak-abort, force-commit-on-abort, serial, begin-on-abort
and weak-begin-on-commit dependencies. In
general, according to the logical nature of the dependency,
they can be either strong or weak.

Definition 1 Given a control-flow dependency $t_i \rightarrow t_j$,
if the dependency implies a logical relationship $st_i \Rightarrow st_j$
($st_i \Leftarrow st_j$), we say that it is weak (strong). □

According to the above definition, a strong dependency
specifies a logical relationship such that $t_j$ can
enter state $st_j$ only if task $t_i$ enters state $st_i$ (e.g., bc, b,
t, gc, etc.). On the other hand, a weak dependency states
that if $t_i$ enters state $st_i$ then $t_j$ can/must enter state
$st_j$, but $t_j$ can enter $st_j$ even if $t_i$ has not entered $st_i$
(e.g., sc, a, fba, wba, wbc, c, e, etc.). That is, the semantic
difference between the strong and weak dependencies is
that the former specifies the necessary condition for $t_j$
to enter $st_j$ whereas the latter the sufficient condition.

Definition 2 Given a control-flow dependency $t_i \rightarrow t_j$,
the dependency is of type abortive if $st_j$ is $ab_j$; otherwise
it is non-abortive. □

The classification given by the above definition applies
to both strong and weak dependencies. For example,
abortive type dependencies include abort, exclusion
dependency, etc., whereas non-abortive type dependencies
are commit, strong commit, begin on commit, serial,
begin on abort, etc.


No-Force-Commit and No-Prevent-Abort
Assumptions:

When applying those dependency definitions to the
workflow environment, two tacit assumptions must be
made because of the relaxed atomicity requirement.

No-Force-Commit (NFC) Assumption: No task execution
can be guaranteed to commit. However, a task
can be forced to begin or abort.

No-Prevent-Abort (NPA) Assumption: No task execution
can be prevented from aborting. However, a task
can be prevented from beginning or committing.

Further examination of these dependencies reveals that
all abortive type workflow dependencies must be weak;
however, non-abortive type workflow dependencies can
be either weak or strong. For instance, some weak non-abortive
type dependencies such as sc, force-commit-on-abort
(fca) ($t_j$ must commit if $t_i$ aborts), and gc, etc.
violate the NFC assumption and all strong abortive type
dependencies such as termination dependency (t) violate
the NPA assumption.

*Group commit involving a set of tasks can be defined using
pairwise group dependencies.

2.1.2  Control-flow  Dependencies  with 
Data-flow 

A control-flow dependency with data-flow can be defined
as follows: A task $t_j$ can enter state $st_j$ after task $t_i$
enters state $st_i$ and $t_i$ passes values of data objects to $t_j$.
In these dependencies, in addition to the control flow,
there could even be information flow (or data flow) between
the tasks where a task needs to wait for data from
another task. Notice that control-flow dependency with
data-flow is meaningful only for limited combinations of
$st_i$ and $st_j$. For example, $st_i$ and $st_j$ can be “commit”
and “begin,” respectively, but cannot be “begin” and
“commit.”

2.1.3  Value  Dependencies 

A value dependency can be defined as follows: A task
$t_j$ can enter state $st_j$ only after task $t_i$'s outcome satisfies
a condition $c_i$. The condition in the above statement can
be a logical expression whose value is either 0 or 1. Note
that this dependency is different from the control-flow
dependency with the data flow. For example, “$t_j$ can
begin if $t_i$ is a success (semantically).”†

2.1.4  External  Dependencies 

Unlike the prior two types, external dependencies are
caused by some parameters external to the system, such
as time. An external dependency can be defined as follows:
A task $t_i$ can enter state $st_i$ only if a certain
condition $c_i$ is satisfied, where the parameters in $c_i$ are
external to the workflow. Examples include a task $t_i$ can
start its execution only at 9:00am or task $t_j$ can start
execution only 24hrs after the completion of task $t_k$.

2.1.5 Definitions

In  the  following,  we  develop  the  necessary  definitions 
for  formalizing  the  execution  model  for  MLS  workflows. 

Definition 3 A workflow W can be defined as a directed
graph whose nodes are the tasks in the workflow
and edges are the task dependencies $t_i \xrightarrow{x} t_j$, where
$t_i, t_j \in W$ and $x$ denotes the type of dependency. □

Definition 4 Two operations $o_i[d]$ and $o_j[d]$ conflict
with each other if they operate on the same data object
d and at least one of them is a write. □

Given a dependency $t_i \rightarrow t_j$, we say $t_i$ is the parent
of $t_j$ and $t_j$ the child of $t_i$.

Definition 5 Given two tasks $t_i$ and $t_j$ in W, $t_i$ is said
to be an ancestor (descendent) of $t_j$, if $t_i$ is a parent
(child) of $t_j$ or $t_i$ is a parent (child) of $t_k$ where $t_k$ is an
ancestor (descendent) of $t_j$. □

†Failure of a task does not necessarily mean abort of a task. A
task may still semantically fail even if it successfully commits.

Definition 6 Given a workflow W, a potential-state-set
of W (denoted as PSS(W)) is a set of vectors such that
each vector (called potential-state) in PSS(W) represents
an allowed combination of final states of all the
tasks in W. □

As an example, consider the workflow $W = \{t_1, t_2\}$
and the dependency $t_1 \xrightarrow{sc} t_2$. $PSS(W) =
\{(cm_1, cm_2), (ab_1, ab_2), (ab_1, cm_2)\}$. Note that a different
dependency between $t_1$ and $t_2$ may result in an entirely
different PSS(W).
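
To make the notion of a potential-state-set concrete, the following sketch enumerates all combinations of final states and keeps those allowed by the dependencies. It is only an illustration under stated assumptions: the function names are hypothetical, final states are restricted to commit ("cm") and abort ("ab"), and the example dependency is assumed to require that t2 commits whenever t1 commits.

```python
from itertools import product

FINAL_STATES = ("cm", "ab")


def commits_force_commit(parent, child):
    """Constraint: if `parent` commits then `child` must commit."""
    return lambda final: not (final[parent] == "cm" and final[child] != "cm")


def abort_forces_abort(parent, child):
    """Constraint: if `parent` aborts then `child` must abort."""
    return lambda final: not (final[parent] == "ab" and final[child] != "ab")


def potential_state_set(tasks, constraints):
    """Return every vector of final states that satisfies all constraints."""
    allowed = set()
    for combo in product(FINAL_STATES, repeat=len(tasks)):
        final = dict(zip(tasks, combo))
        if all(check(final) for check in constraints):
            allowed.add(combo)
    return allowed


pss = potential_state_set(["t1", "t2"], [commits_force_commit("t1", "t2")])
```

Under these assumptions `pss` contains the three vectors (cm, cm), (ab, ab) and (ab, cm). Swapping in `abort_forces_abort` yields a different set, which illustrates the remark that a different dependency may change PSS(W).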

Definition 7 Given two workflows $W$ and $W'$, we say
that $W'$ covers $W$ if (1) both $W$ and $W'$ consist of the
same set of data operations, (2) for every pair of conflicting
operations $o_i[d]$ and $o_j[d]$, if $t_i$ is an ancestor of $t_j$ in
$W$, then $t_i$ must be an ancestor of $t_j$ in $W'$, and (3) for
each $P \in PSS(W)$, there exists a $P' \in PSS(W')$ such
that $P \subseteq P'$. □

2.2  The  Security  Model 

We assume the security structure to be a partially
ordered set S of security levels with ordering relation
$\le$. A class $s_i \in S$ is said to be dominated by another
class $s_j \in S$ if $s_i \le s_j$. A class $s_i$ is said to be strictly
dominated by another class $s_j$ (denoted as $s_i < s_j$) if
$s_i \le s_j$ and $i \ne j$.

Let D be the set of all data objects. Each data object
$d \in D$ is associated with a security level. Every task
$t_i$ in a workflow W is associated with a security level.
We assume that there is a function L that maps all data
objects and tasks to security levels. That is, for every
$d \in D$, $L(d) \in S$, and for every task $t_i \in W$, $L(t_i) \in S$.
We require every task to obey the following two security
properties: the simple security property and the restricted
★-property.

1. A task $t_i$ is allowed to read a data object d only if
$L(d) \le L(t_i)$.

2. A task $t_i$ is allowed to write to a data object d only
if $L(d) = L(t_i)$.

In addition to these two restrictions, a secure system
must prevent illegal information flows via covert channels.
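
The two access rules can be sketched as simple checks over an explicit ordering relation. The sketch below is illustrative only: the two-level ordering, the set name `LEQ` and the function names are assumptions, and the write rule is taken to require equal levels.

```python
# Assumed ordering relation: (a, b) in LEQ means level a is dominated by level b.
LEQ = {("low", "low"), ("low", "high"), ("high", "high")}


def can_read(task_level, data_level, leq=LEQ):
    """Simple security: reading requires L(d) <= L(t)."""
    return (data_level, task_level) in leq


def can_write(task_level, data_level):
    """Restricted star-property (as assumed here): writing requires L(d) = L(t)."""
    return data_level == task_level
```

With this relation a high task may read low data but may not write it, and a low task may neither read nor write high data. Because the relation is an explicit set of pairs, incomparable levels can be modeled by omitting both orderings of a pair.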

3  Multilevel  Secure  Workflows 

A multilevel secure (MLS) workflow may consist of
tasks of different security levels (as in Example 1 below).
Thus, an MLS workflow consists of nodes at different
security levels where the dependency edges may connect
tasks of either the same security level or different security
levels, which can be distinguished as follows. The dependency
edge connecting tasks of the same security level is
referred to as intra-level dependency and the one connecting
tasks of different security levels as inter-level dependency.
Since intra-level dependencies by themselves
cannot violate any multilevel security constraints and are
not different from the task dependencies in a non-secure
environment, hereafter we concentrate only on inter-level
dependencies. We further divide inter-level dependencies
into two categories: high-to-low and low-to-high, since
their treatment has to be different in an MLS environment
because of its “no downward information flow” requirement.

‡Here we made an assumption that $s_i = s_j$ iff $i = j$.


Example 1 Consider a workflow that computes the
weekly pay of all employees at the end of each week.
This process involves several tasks as follows. Task $t_1$:
compute the number of hours worked by an employee (h),
which is the sum of regular hours worked (n) and overtime
hours worked (o) by the employee during that week.
Task $t_2$: calculate the weekly pay of an employee (p) by
multiplying h with the hourly rate of the employee (r),
and Task $t_3$: after computing the pay for the week, reset
h, n and o to zero. The information about hourly
rate (r) and weekly pay (p) is considered sensitive, and
therefore both r and p are classified high, while the rest
of the information is classified low. Since this workflow
involves write operations at different levels, it is an MLS
workflow.

According to the two restrictions of our security
model, since $t_1$ and $t_3$ write objects at low (h, n and
o) they must be low tasks, and since $t_2$ reads the high
object (r) and writes the high object (p), it must be a
high task.

Moreover, the following task dependencies exist: task
$t_2$ can begin only after $t_1$ commits, thus $t_1 \xrightarrow{bc} t_2$, and $t_3$
can begin only after $t_2$ commits, thus $t_2 \xrightarrow{bc} t_3$, as shown
in Figure 1. While $t_1 \xrightarrow{bc} t_2$ is a low-to-high dependency,
$t_2 \xrightarrow{bc} t_3$ is high-to-low. Thus it is an MLS workflow. □
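
The distinction between intra-level, low-to-high and high-to-low dependencies in Example 1 can be sketched as follows. The names `LEQ`, `LEVEL` and `classify` are hypothetical, and the sketch treats incomparable levels as high-to-low, which is an assumption about how such pairs should be handled.

```python
# Assumed ordering relation: (a, b) in LEQ means level a is dominated by level b.
LEQ = {("low", "low"), ("low", "high"), ("high", "high")}

# Security levels of the tasks in Example 1.
LEVEL = {"t1": "low", "t2": "high", "t3": "low"}


def classify(parent, child, level=LEVEL, leq=LEQ):
    """Classify the dependency parent -> child by the levels of its endpoints."""
    lp, lc = level[parent], level[child]
    if lp == lc:
        return "intra-level"
    if (lp, lc) in leq:
        return "low-to-high"
    # Anything else (strictly higher parent, or incomparable levels).
    return "high-to-low"


dependencies = [("t1", "t2"), ("t2", "t3")]
kinds = [classify(p, c) for p, c in dependencies]
```

Here `kinds` evaluates to `["low-to-high", "high-to-low"]`, matching the classification given for the two dependencies of Example 1.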


Execution  of  a  workflow  involves  (1)  enforcing  all  task 
dependencies,  (2)  assuring  correct  execution  of  inter¬ 
leaved  workflows,  and  (3)  ensuring  that  the  workflow 
terminates  in  one  of  the  predefined  acceptable  states. 
In  this  paper,  we  focus  on  the  first  part  only. 

It follows from the no covert channel requirement that
for any given task dependency $t_i \rightarrow t_j$, $L(t_i) \le L(t_j)$.
That is, to prevent covert channels, no high-to-low dependency
may be enforced.

In a correct MLS workflow specification, it is not possible
to have a high-to-low value dependency because
enforcing such dependency amounts to directly sending
data from a higher to a lower security level. The same
argument applies to the case of a high-to-low control flow
dependency with data flow.

§Although we use the term high-to-low, this dependency also
includes those among two incomparable security levels.

With respect to external dependencies, for the purpose
of our work, we categorize them as absolute and relative,
where absolute dependencies are solely controlled
by the external factors, whereas relative dependencies
are specified as external parameters but are controlled
by the internal events. For example, “a task $t_i$ can start
its execution only at 9:00am” is an absolute external dependency,
whereas “task $t_j$ can start its execution only
24hrs after the completion of task $t_k$” is a relative external
dependency. We need this classification because
enforcement of these two types is different in an MLS environment.
While absolute external dependencies can be
enforced without compromising security, relative external
dependencies may be exploited to establish a covert
channel, especially when this dependency is from a high
task to a low task.

A  relative  external  dependency  is  nothing  but  a  con¬ 
trol  flow  dependency  with  additional  temporal  con¬ 
straints.  Since  temporal  constraints  cannot  be  modeled 
by  simple  graph  structures  but  require  special  modeling 
techniques  that  can  incorporate  external  events,  we  do 
not  consider  them  in  this  paper.  Therefore,  in  this  paper, 
we  consider  only  the  control-flow  dependencies. 

3.1 Execution Criteria

In  the  following,  we  define  four  levels  of  execution 
based  on  the  degree  of  security  or  correctness  that  it 
guarantees.  First  we  recall  the  definition  of  secure  exe¬ 
cution  from  [12].  An  execution  is  said  to  be  secure  if  it 
satisfies  the  non-interference  property  [10],  i.e.,  no  lower 
level task is affected by any higher level task.

1.  SSSC-level  (strongly-secure  and  strongly-correct): 
An  MLS  workflow  execution  is  said  to  be  of  SSSC- 
level,  if  it  is  secure  and  all  the  task  dependencies 
are  enforced.  This  calls  for  complete  elimination 
of  covert  channels,  yet  enforcing  all  dependencies. 
This  is  the  most  desirable  case. 

2.  SSWC-level  (strongly-secure  and  weakly-correct): 
An MLS workflow execution is said to be of SSWC-
level,  if  it  is  secure  but  all  the  task  dependencies 
need  not  be  enforced.  This  requires  complete  elimi¬ 
nation  of  all  covert  channels,  however  one  need  not 
enforce  all  the  task  dependencies. 

3.  WSSC-level  (weakly-secure  and  strongly-correct): 
An  MLS  workflow  execution  is  said  to  be  of  WSSC- 
level,  if  it  enforces  all  the  task  dependencies  but  may 
allow  a  low  bandwidth  covert  channel.  The  band¬ 
width  of  the  covert  channel  can  be  reduced  by  in¬ 
troducing  noise  or  introducing  a  fixed  delay. 

4. WSWC-level (weakly-secure and weakly-correct):
An MLS workflow execution is said to be of WSWC-level,
if it does not enforce all the task dependencies,
yet allows covert channels.

Figure 1: Task dependencies in the MLS workflow in
Example 1

Figure 2: Task dependencies in the MLS workflow in
Example 2

Although  it  is  desirable  to  have  the  first  level  of  execu¬ 
tion,  this  level  is  difficult  to  achieve  due  to  the  inherent 
conflicts between security and correctness. The approach
proposed in [2] eliminates all high-to-low task dependencies
(i.e., it ensures SSWC-level execution). In this paper,
we  show  how  SSSC-level  of  execution  can  be  attained. 
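
The four execution levels combine two independent properties: whether the execution is secure (free of covert channels) and whether all task dependencies are enforced. A minimal sketch of that mapping, with a hypothetical function name, is:

```python
def execution_level(secure, all_dependencies_enforced):
    """Map the two properties of an execution to the level names used above."""
    security = "SS" if secure else "WS"  # strongly vs weakly secure
    correctness = "SC" if all_dependencies_enforced else "WC"  # strongly vs weakly correct
    return f"{security}{correctness}-level"
```

For example, an execution that is secure but leaves some dependency unenforced maps to `"SSWC-level"`, while one that is secure and enforces every dependency maps to `"SSSC-level"`.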

4  Semantic  Classification  of  Task  Depen¬ 
dencies  in  MLS  Workflows 

In  this  section,  we  take  a  closer  look  at  all  types  of 
dependencies  and  examine  what  they  semantically  mean 
in  an  MLS  environment.  We  give  more  insight  into  each 
type  of  dependency  in  MLS  environment  and  argue  that 
only  some  types  of  dependencies  can  be  specified  in  MLS 
workflows  and  other  types  do  not  exist  in  a  correct  secure 
workflow  specification.  Note  that  our  arguments  focus 
only  on  high-to-low  dependencies  because,  as  we  argue 
in  section  5.1,  low-to-high  dependencies  can  be  enforced 
without  compromising  security  requirements. 

To  reason  about  the  semantics  of  high-to-low  depen¬ 
dencies  (control-flow),  we  first  would  look  at  the  source 
of  this  dependency  and  categorize  them  as  follows: 

1.  This  first  category  of  dependencies  arises  to  force 
the  order  of  (conflicting)  operations  on  shared  data 
objects. 

2.  The  second  category  of  dependencies  arises  to  force 
properties  such  as  atomicity,  mutual  exclusion,  etc. 

Consider the task dependency $t_2 \xrightarrow{bc} t_3$ in Example 1.
The intention of this dependency is to avoid overwriting
of n and o by $t_3$ before $t_2$ reads them. Thus,
the  source  of  this  dependency  is  to  force  a  specific  order 
on  the  conflicting  operations,  and  therefore  belongs  to 
the  first  category.  Dependencies  in  the  second  category 
are  specified  according  to  the  semantics  of  the  workflow. 
For example, the semantics require that either one of two
tasks  must  commit  but  not  both  (mutual  exclusion).  For 
instance,  consider  a  travel  reservation  workflow,  where 
two  tasks  are  purchasing  a  ticket  in  Delta  and  in  United, 
where  only  one  task  must  commit  but  not  both.  In  the 
following,  we  provide  an  example  illustrating  such  de¬ 
pendencies  in  an  MLS  environment. 

Example 2 Consider a workflow that arranges a travel
schedule for a person P. Assume that P has to first make
a trip from Washington D.C. to Toronto and then from
Toronto to Moscow. The second part of the trip is on
a secret mission and therefore has to be considered as
highly sensitive information and thus assumes high level.
However, the first part of the trip is not classified and
thus is considered low. Assume this workflow consists of
the following four tasks: reserving a ticket for the first
part of the trip (denoted as $t_1$), purchasing the ticket for
the first part ($t_2$), reserving a ticket for the second part
of the trip ($t_3$), and purchasing the ticket for the second
part of the trip ($t_4$), where $t_1$ and $t_2$ are low level tasks
and $t_3$ and $t_4$ are high level tasks. The following task
dependencies exist: Purchasing a ticket cannot be started
unless reserving the ticket is complete. Thus, $t_1 \rightarrow t_2$
and $t_3 \rightarrow t_4$. Moreover, reserving a flight for the second
part of the trip has to be done only after making
sure that the flight is available for the low part of the
trip, thus $t_1 \rightarrow t_3$. Furthermore, purchasing the ticket
for the low part of the trip must be committed only if
purchasing the ticket for the high part of the trip is successful,
thus, $t_4 \rightarrow t_2$. While the first two task dependencies
are intra-level dependencies, the latter two are
low-to-high and high-to-low dependencies, respectively,
as shown in Figure 2. □

The intention of the high-to-low dependency $t_4 \rightarrow t_2$
in the above example is to capture the semantics of the
workflow rather than forcing an order between conflicting
operations. Thus this dependency belongs to the second
category.

If we examine once again the two types of the source of
dependencies and analyze their effect on the workflow, we
can make the following observation. Imagine the following
two scenarios: in the first, assume the high-to-low dependency
$t_i \rightarrow t_j$ is enforced, whereas in the second, this
dependency is not enforced. With the second category
of dependency (e.g., $t_4 \rightarrow t_2$ in Example 2) the result of
$t_j$ might be different if $t_i \rightarrow t_j$ is not enforced than in
the case when it is enforced. In other words, the result
of $t_j$ might be affected if $t_i \rightarrow t_j$ is not enforced. However,
the dependency $t_i \rightarrow t_j$ does not impact the result
of $t_i$. Thus we call the second category of dependencies
result-dependent (RD). On the other hand, consider the
first category of dependencies (e.g., $t_2 \xrightarrow{bc} t_3$ in Example
1). If the dependency is not enforced, the result of $t_j$
will not be affected but that of $t_i$ will be affected. This
can only occur when the two tasks share common data
in multilevel secure systems (when this dependency is
high-to-low). Thus, we call the first category of dependencies
conflicting (CN). In the following, we formally
define these categories.

Definition 8 A dependency between two tasks $t_i \rightarrow t_j$
is said to be conflicting (CN) if there exist at least two
conflicting operations $o_i[d]$ and $o_j[d]$ ($i \ne j$); otherwise
$t_i \rightarrow t_j$ is said to be conflict-free (CF). □
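
Definition 8 can be checked mechanically once the operations of each task are known. The sketch below is illustrative: operations are modeled as hypothetical (kind, object) pairs with kind "r" or "w", and the operation sets are one possible reading of Example 1 rather than data given in the text.

```python
def operations_conflict(op_a, op_b):
    """Two operations conflict if they touch the same object and one is a write."""
    (kind_a, obj_a), (kind_b, obj_b) = op_a, op_b
    return obj_a == obj_b and "w" in (kind_a, kind_b)


def classify_dependency(parent_ops, child_ops):
    """Return 'CN' if any pair of operations conflicts, otherwise 'CF'."""
    for a in parent_ops:
        for b in child_ops:
            if operations_conflict(a, b):
                return "CN"
    return "CF"


# Assumed operation sets for tasks t2 and t3 of Example 1.
t2_ops = [("r", "h"), ("r", "r"), ("w", "p")]
t3_ops = [("w", "h"), ("w", "n"), ("w", "o")]
kind = classify_dependency(t2_ops, t3_ops)
```

With these assumed operation sets, `kind` is `"CN"` because the parent reads h and the child writes h. Two tasks with no shared data object would be classified `"CF"`.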

On the other hand, from the perspective of task result,
a dependency can either be result-independent or result-dependent.
Formally:

Definition 9 A dependency $t_i \rightarrow t_j$ is said to be
result-dependent (RD) if the result of executing the child
is different when the dependency is enforced from that
when it is not enforced; otherwise $t_i \rightarrow t_j$ is said to be
result-independent (RI). □

The intuitive idea behind this classification is that the
result of executing either the child (in the case of RD)
or the parent (in the case of CN) must differ depending on
whether the dependency is enforced. Thus, there cannot be any
dependency which is both conflict-free and result-independent.

Let  us  now  examine  this  categorization  in  the  wake  of 
multiple  security  levels  on  data  items  and  tasks.  At  this 
point,  our  primary  concern  is  how  to  enforce  high-to-low 
dependencies  in  a  MLS  workflow. 

• Case CN: Dependencies belonging to this category
indicate that the two tasks involved in the dependency
access some common data items in conflicting mode. The
primary reason to enforce this type of dependency is to
enforce the order of these conflicting operations. This
category depicts a typical conflicting situation where the
parent with a read operation is followed by the child with a
write operation on the same data object (no other combination
of read-writes is possible as per our security model).
High-to-low CN dependencies can be further classified into
result-dependent (RD) and result-independent (RI)
dependencies, thus resulting in CN-RD dependencies and CN-RI
dependencies. These two categories of dependencies are
briefly discussed below.

- CN-RI: RI dependencies mean that the result of the child
does not depend on whether the dependency is enforced or not.
However, not enforcing this dependency may produce a
different result for the parent task. Therefore, all
dependencies falling into this category should be
non-abortive, such as begin on commit, begin on abort, etc.,
because otherwise the result of the child would be affected
(i.e., aborted) by the parent.

- CN-RD: Dependencies such as force abort, termination,
exclusion, etc. that may cause the child task to get aborted
(the abortive type) fall under the RD category. Abortive type
dependencies are all RD because, without enforcing the
dependency, the child task may commit (as opposed to abort).
Since every abortive dependency could possibly cause the
abortion of the child, none of them is of type RI.

• Case CF: Dependencies belonging to this category
can only be formed by pure semantic specification. Although
two tasks do not conflict, such dependencies are sometimes
specified to ensure an acceptable termination state of the
workflow.

- CF-RD: These dependencies are specified in such a way that
the execution of the child depends on the value or outcome of
the parent. Therefore altering the execution order between
these two tasks would affect the outcome of the child; in
other words, the dependency is result-dependent. These
dependencies can be either strong or weak.

- CF-RI: A CF type dependency does not cause any effect on
the result of the parent task. Thus, there cannot exist any
dependency which is both CF and RI because its presence
would affect neither the parent nor the child.

Figure 3: Categorization of high-to-low dependencies

The Venn diagram shown in figure 3 depicts these various
categories of high-to-low dependencies. The above
categorization of high-to-low dependencies is important,
because each category needs to be handled according to
a different approach in an MLS environment. The specific
approach used for each category will be presented
in the next section.

We now introduce an algorithm to classify the high-to-low
dependencies in a given workflow according to the
above classification. The algorithm only needs to know
the set of data that will be potentially read and written
by each task and the set of dependencies among tasks.
Note, however, that the approach we introduce in the next
section still applies even if a task only reads (writes) a
subset of such a set.

Algorithm 1 [Identifying the Type of Dependency]

for every $t_i \xrightarrow{x} t_j$ in W where $L(t_j) < L(t_i)$,
    /* for every high-to-low dependency */
    if $\exists\, r_i[d] \in t_i$ and $w_j[d] \in t_j$ where $i \neq j$
        /* if two tasks are conflicting */
        if $t_i \xrightarrow{x} t_j$ is abortive
            label x with CN-RD
            /* abortive CN type dependencies are RD */
        else
            label x with CN-RI
            /* non-abortive CN type dependencies are RI */
    else
        label x with CF-RD
        /* all conflict-free dependencies must be RD */
end{for} □
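
The labelling logic of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `Task` and `Dependency` classes, the integer encoding of security levels (larger means higher), and the `abortive` flag are all assumptions introduced here.

```python
from dataclasses import dataclass, field

# Hypothetical encoding: security levels are integers, larger = higher.
@dataclass
class Task:
    name: str
    level: int
    reads: set = field(default_factory=set)   # data items potentially read
    writes: set = field(default_factory=set)  # data items potentially written

@dataclass
class Dependency:
    parent: Task
    child: Task
    abortive: bool  # True for e.g. force abort, termination, exclusion

def classify(dep):
    """Return the label Algorithm 1 assigns, or None if not high-to-low."""
    if not dep.child.level < dep.parent.level:
        return None  # Algorithm 1 only labels high-to-low dependencies
    # Conflict: the high parent reads an item that the low child writes.
    conflicting = bool(dep.parent.reads & dep.child.writes)
    if conflicting:
        return "CN-RD" if dep.abortive else "CN-RI"
    return "CF-RD"  # a conflict-free dependency must be result-dependent

# Made-up tasks: a high task reading items that a low task overwrites.
t_high = Task("t_high", level=1, reads={"a", "b"}, writes={"c"})
t_low = Task("t_low", level=0, writes={"a"})
print(classify(Dependency(t_high, t_low, abortive=False)))
```

The conflict test mirrors the only read/write combination the security model allows for a high-to-low pair: a read by the parent followed by a write by the child on the same item.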

5  Execution  of  MLS  Workflows 

Enforcing a low-to-high dependency will not result in
violation of security. By contrast, a covert channel may
be established while enforcing a high-to-low dependency.
The high-to-low dependencies are, therefore, much more
difficult to handle than low-to-high dependencies. In the
remainder of this section, we first briefly summarize two
possible approaches to enforcing low-to-high dependencies.
We then discuss approaches to enforcing high-to-low
dependencies, which are the focus of our paper.

5.1  Low-to-High  Dependencies 

Consider a task dependency $t_i \xrightarrow{bc} t_j$ such that
$L(t_i) < L(t_j)$, meaning that $t_j$ can begin only after $t_i$'s
commit. Enforcing such a dependency requires the use
of a mechanism by which the higher-level task $t_j$ is
activated upon commit of $t_i$. Several approaches can be
devised.

A first approach is based on the use of triggers. Under
this approach, a trigger would be incorporated into
$t_i$. Thus $t_i$ activates a trigger upon its commit at the
high level, at which point the high task $t_j$ can begin.
This will not violate security since it is equivalent to
writing-up. To ensure that the trigger is delivered from
low to high, and thus to increase reliability, this
approach can be complemented by mechanisms supporting
reliable transfer of messages in multi-level systems.
Recently, an approach called PUMP has been proposed
[11] which provides a reliable transfer of messages from
lower to higher levels with a controlled stream of
acknowledgments from higher to lower levels. The PUMP
mechanism, however, may be exploited by malicious
programs as a covert channel, even though with a very small
bandwidth. An analysis has been carried out in [11] to
measure the bandwidth of the covert channel.

Another approach is based on testing a given precondition.
Such a precondition has to be satisfied to begin the
high task $t_j$. This can be implemented by making the
high task read some data at low level and check for
the satisfaction of the precondition periodically. Note
that this approach does not require secure message
passing as in the earlier approach. On the other hand, it
requires the high task to poll some low data to test for
the precondition.

5.2  High-to-Low  Dependencies 

In the following section, we present our approach to
handle high-to-low dependencies. Since CN-RI dependencies
are conflicting, the main issue is how to synchronize
the tasks to satisfy the dependency without introducing
covert channels. Our approach to handle CN-RI
dependencies eliminates the high-to-low dependency by
splitting the high level task. The purpose of a CF-RD
dependency is to force a low level task to move to a certain
state according to the state of a high level task. Our
approach to handle CF-RD compensates the low task when
necessary by executing an inverse dependency. Finally,
we view CN-RD as a combination of CN-RI and CF-RD
since it could be due to the conflicting operations as well
as due to the semantics of the workflow.

5.2.1  Enforcing  CN-RI  type  high-to-low 
Dependencies 

As described in the earlier section, these dependencies
arise if there exists a high task that must read a low
data item before it is modified by another low task. As
in example 1, the intention of the dependency $t_2 \rightarrow t_3$
is to prevent $t_3$ from overwriting the low values of data
items n and o yet to be read by $t_2$. Since delaying $t_3$
until $t_2$'s commit would result in a covert channel, we
propose two possible approaches to tackle this problem.
The first approach is based on the multiple versions
approach presented in [7]; the second approach is new and
proposed by this paper.

Maintaining Multiple Versions: One may use multiple
versions of data to cope with such high-to-low dependencies.
Whenever a task writes a data item d, a
new version of d is created, thus the value yet to be read
by a high transaction is reserved as an older version of
d. Costich and Jajodia [7] have proposed an approach in
which they associate an index with each read/write
operation. When a multilevel transaction first updates d, it
is indexed by 1, the next write to d is indexed by 2, and
so on, and when it reads d, the read operation is assigned
the same index as that of the previous write operation.
Thus, this indexing is used to preserve the dependencies
by allowing a high task (which they call a section of the
multilevel transaction) to read an appropriate version of
d in order to enforce the dependency. This approach
has the major drawback of requiring a special-purpose
multiversioning concurrency control mechanism. The
approach we propose here (described below) does not
have such a requirement and therefore can be supported
by any standard DBMS. A similar approach to preserve
the high-to-low dependency has been proposed by Smith
et al. [15]. This uses a cache to save data to be read by
a high section so that even if a low section overwrites
the data yet to be read by the high section, the dependency
is still preserved. Our approach of splitting the
task (presented below) provides a framework to enforce
the high-to-low dependency which can be implemented
using Smith et al.'s caching scheme.

Splitting the High Task: According to our approach,
first all the operations in every task are reordered
in such a way that all read operations on lower level data
items occur before all operations on data items at the
level of the task (since all these are read operations, it
does not affect the correctness of the task). Then the
task is divided into partitions based on the data items
it is accessing. For example, if a high task $t_2$ in example 1
is split into two tasks, the first task $t_2^{low}$ contains
all the low read operations, and the second task $t_2$ all
the high read/write operations. We introduce a
begin-on-commit dependency with data flow
$t_2^{low} \xrightarrow{bc} t_2$ to ensure that data read by the
read operations in $t_2^{low}$ in fact are carried over to $t_2$
even in the wake of other interfering tasks. Then, to enforce
$t_1 \xrightarrow{bc} t_2$ and $t_2 \xrightarrow{bc} t_3$,
we convert these two dependencies into
$t_1 \xrightarrow{bc} t_2^{low}$ and $t_2^{low} \xrightarrow{bc} t_3$,
as shown in figure 5. The low task $t_3$ proceeds only if
$t_2^{low}$ commits, thus preserving the high-to-low
dependency between $t_2$ and $t_3$. In the following, we
formally present our approach.

Definition 10 Given a task $t_i$, given $s < L(t_i)$, we say
that there exists a partition $t_i^s$ of $t_i$ if
$t_i^s = \{o_i[d] \in t_i \mid L(d) = s\} \neq \emptyset$. □

Since a task $t_i$ is allowed to read and write data at
its own level as well as read data from lower levels, all
read operations pertaining to a lower level s belong to
one partition (say $t_i^s$), whereas all read and write
operations pertaining to the level of the task belong to another
partition (we use the original identity $t_i$). Thus if a task
reads from two lower levels and reads or writes at its own
level, it has three partitions.
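
As a rough illustration of this partitioning, the following Python sketch groups a task's operations by the security level of the data item they access. The function name `split_by_level`, the `(kind, item)` encoding of operations, and the integer levels are assumptions made for the example, not part of the paper.

```python
from collections import defaultdict

def split_by_level(ops, task_level, item_level):
    """Partition a task's operations by the level of the item accessed.

    ops: list of (kind, item) pairs with kind in {"r", "w"}.
    Returns (low_partitions, own_level_ops): low_partitions maps each
    level s < task_level to the read operations on items at level s.
    """
    low = defaultdict(list)
    own = []
    for kind, item in ops:
        s = item_level[item]
        if s < task_level:
            if kind != "r":
                raise ValueError("a task may only read lower-level data")
            low[s].append((kind, item))
        elif s == task_level:
            own.append((kind, item))
        else:
            raise ValueError("a task may not access higher-level data")
    return dict(low), own

# A level-2 task reading from two lower levels: three partitions in total.
item_level = {"a": 0, "b": 0, "c": 2, "d": 1}
ops = [("r", "a"), ("w", "c"), ("r", "b"), ("r", "d")]
low_parts, own_ops = split_by_level(ops, 2, item_level)
print(low_parts, own_ops)
```

Each key of `low_parts` corresponds to one lower-level partition, and `own_ops` plays the role of the partition that keeps the original task identity.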

Definition 11 Given two tasks $t_i$ and $t_j$ in W, $t_j$ is said
to be a closest-s-ancestor of $t_i$ if (1) $L(t_j) = s$, (2) $t_j$ is
an ancestor of $t_i$, and (3) there exists no $t_k$ in W such
that $L(t_k) = s$ and $t_k$ is an ancestor of $t_i$ and a descendant
of $t_j$. □

Figure 4: An example demonstrating the closest ancestor

Figure 5: Modified Workflow after executing split_task
for Example 1

For  example,  in  the  workflow  shown  in  figure  4,  is  and 
ti  are  the  closest-mid-ancestors  of  and  is  is  the  closest- 
low-ancestor  of  By  contrast,  te  is  not  a  closest-mid- 
ancestor  of  ^y  because  there  exists  tg  which  is  an  ancestor 
of  ty  and  a  descendant  of  tg. 
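
Definition 11 can be checked mechanically on a dependency graph. The sketch below is a minimal illustration under stated assumptions: the workflow is a DAG given as a hypothetical `parents` mapping from each task to its direct predecessors, levels are plain integers, and the small graph used at the end is invented for the example (it is not the workflow of figure 4).

```python
def ancestors(node, parents):
    """All strict ancestors of node in a DAG given as {node: set_of_parents}."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def closest_s_ancestors(task, s, parents, level):
    """Level-s ancestors of task with no level-s task strictly in between."""
    at_s = {a for a in ancestors(task, parents) if level[a] == s}
    # t_j is ruled out if another level-s ancestor t_k of the task is a
    # descendant of t_j, i.e. t_j is itself an ancestor of t_k.
    return {tj for tj in at_s
            if not any(tj in ancestors(tk, parents) for tk in at_s if tk != tj)}

# Invented workflow: a(low) -> b(mid) -> c(mid) -> d(high), and e(mid) -> d.
parents = {"b": {"a"}, "c": {"b"}, "d": {"c", "e"}}
level = {"a": 0, "b": 1, "c": 1, "d": 2, "e": 1}
print(closest_s_ancestors("d", 1, parents, level))
print(closest_s_ancestors("d", 0, parents, level))
```

In this invented graph, `b` is excluded from the mid-level result because `c` lies between `b` and `d` at the same level, which is exactly condition (3) of the definition.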

The  following  algorithm  specifies  our  approach  to  task 
splitting. 

Algorithm 2 [split_task]

for every $t_i \xrightarrow{dep} t_j$ in W where dep is CN-RI
    split $t_i$ into a number of partitions where each partition
        $t_i^s = \{o_i[x] \in t_i \mid L(x) = s \wedge s < L(t_i)\}$
    remove every operation $o_i[d]$ from $t_i$ such that $L(d) < L(t_i)$
    /* divide the high parent such that each partition
       consists of operations involving access to data items
       of the same security level */
    add all $t_i^s$ to W
    for each $t_i^s$ such that $s < L(t_i)$
        add an edge $t_i^s \xrightarrow{bc} t_i$
        /* add a bc dependency with data-flow from
           every lower level partition to that at level $L(t_i)$ */
        for every $t_h$ in W where $t_h$ is the closest-s-ancestor of $t_i$,
            add an edge $t_h \xrightarrow{bc} t_i^s$
        end{for};
    end{for};
    add an edge $t_i^s \xrightarrow{dep} t_j$ such that $s = L(t_j)$
end{for} □

According to the above algorithm, given a dependency
$t_i \xrightarrow{dep} t_j$, we first split $t_i$ into a number of
partitions based on the security level of data items involved in
each operation of $t_i$. Then we add an edge of type “bc” (i.e.,
begin-on-commit) with data flow from every lower level
partition ($t_i^s$) to the highest level partition ($t_i$). This is
to ensure that the data read by a low partition reaches
the high partition. Later, for each level where there is a
partition of $t_i$, we add an edge of type “bc” from every
closest ancestor $t_h$ of $t_i$ at that level to that partition.
This edge is to ensure that splitting does not remove any
dependency from $t_h$ to $t_i^s$. We need to take care of the
case where $t_h$ and $t_i^s$ are conflicting. The abortive type
need not be enforced because we need not preserve the
order of conflicting operations if $t_i^s$ is to abort, since
$t_i^s$ is always read-only. Thus $t_h \rightarrow t_i^s$ is always
of type “bc.”

However, in the above algorithm, we do not add an
edge from a closest-s-ancestor $t_h$ of $t_i$ to a partition $t_i^{s'}$
where $s' < s$ because the dependency path from $t_h$ to
$t_i$ is not meant to capture the dependency from a lower
level $s'$. For example, in the workflow shown in figure 4,
if  t^*^  exists  and  t!^  does  not  exist,  we  need  to  add  de¬ 
pendencies  ti  and  tg  t^'^  but  do  not  need 

to  add  any  dependency  from  tg  though  it  is  a  closest- 
ancestor  of  ty.  On  the  other  hand,  if  t^*^  does  not  exist 
but  only  tff^  exists,  then  we  need  to  add  only  the  de¬ 
pendency  ^3  t!/*^ .  In  the  last  step,  we  add  an  edge 

from the partition at the level of $t_j$ to $t_j$. This edge is of
the same type as the original dependency.

Theorem 1 Let W be a workflow. Let W' be the workflow
obtained from W by applying algorithm 2 to W.
Then, W' covers W. □

5.2.2  Enforcing  CF-RD  type  high-to-low 
Dependencies 

As noted earlier, weak abortive, strong non-abortive
and weak non-abortive dependencies fall into the CF-RD
category. We discuss them in separate sections as
the first two require a different treatment from the last
one.


Weak abortive and strong non-abortive type. In
this section, we first present a straightforward approach
employed in many systems, which is based on the use of
a buffer. It has, however, the drawback of introducing
some covert channel, even though with a limited bandwidth.
Then we present our approach, based on using
compensating tasks, which does not have such a drawback.
According to our approach, sometimes the compensating
tasks need to be executed by a trusted subject such as a
human user. Note that this is not necessarily a drawback
because workflow systems are designed to support and
allow user interactions. Therefore, requiring the
intervention of a user is natural in a workflow environment.

Using a buffer. One approach is to use a buffer at high
(assume its size is sufficiently large) in which the commit
message of the high task is stored. This message will first
be subject to a delay of some random duration, and then
will be transmitted to low. If several such messages of a
single workflow accumulate during the delay period
of the first message, these messages cannot be sent at the
same time, but must be sent individually with the delay
incorporated between each of them. Thus, though
there exists a channel of downward information flow, the
bandwidth of this channel would be low. (However, if
the bandwidth of the channel does not exceed 100 bits
per second, then it is fully secure (at B3 or A1 level).) It
is important to note that this approach might affect the
performance of the system because the signal from high is
delayed, thereby delaying all the low tasks unnecessarily.
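
The buffering scheme just described can be illustrated with a small simulation of release times. This is a sketch under assumptions of our own: delays are drawn uniformly from `[0, max_delay]`, time is a plain float, and the function name `release_times` is hypothetical; the text does not prescribe a particular delay distribution.

```python
import random

def release_times(arrivals, max_delay, rng):
    """Time at which each buffered commit message is sent to low.

    Messages are sent one at a time; each one waits for the previous
    message to leave and then for its own random delay.
    """
    out, last = [], 0.0
    for t in sorted(arrivals):
        start = max(t, last)                     # wait for the previous message
        last = start + rng.uniform(0, max_delay)  # add a fresh random delay
        out.append(last)
    return out

# Three commit messages arriving at the same instant are spread out in time.
rng = random.Random(0)
print(release_times([0.0, 0.0, 0.0], 5.0, rng))
```

Because every message incurs its own delay, a burst of high commits is never observable at low as a burst, which is what limits the bandwidth of the downward channel at the cost of slowing the low tasks.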

Running Compensating Tasks. In this paper, we
propose an alternative approach to enforce an RD type
of high-to-low dependency by executing a compensating
task. We show below that some high-to-low dependencies
(strong non-abortive and weak abortive type)
can be enforced by compensating a low task. Note that
the low compensating task needs to be initiated by a
trusted subject. This approach, however, is applicable
only to tasks for which there exists a compensating
task.

Definition 12 A task $t_i$ is said to be compensatable if
the effects of its execution can be semantically undone
by executing a compensating task $t_i^{-1}$. □

Definition 13 If there exists a compensating task $t_i^{-1}$
for a task $t_i$, then $L(t_i^{-1}) = L(t_i)$. □

Definition 14 Given a dependency $t_i \xrightarrow{x} t_j$ of type
x, we define an inverse $x^{-1}$ for x such that for each
$P \in PSS(W)$, there exists a $P' \in PSS(W')$ such that
$P \subseteq P'$ where,
if x is either strong non-abortive or weak abortive,
W is the workflow with the dependency $t_i \xrightarrow{x} t_j$ and
W' is the workflow with the dependency
$t_i \xrightarrow{x^{-1}} t_j^{-1}$, in which $t_j^{-1}$
is a compensating task of $t_j$. □

In order to derive such an inverse for each dependency,
we have made the following assumptions.

Given a task $t_i$ and its compensating task $t_i^{-1}$:

1. $ex_i \Rightarrow$ either $ab_i$ or $cm_i$
2. $ex_i^{-1} \rightarrow cm_i^{-1}$
3. $\neg ex_i = ab_i$
4. $\neg ex_i^{-1} = ex_i$
5. $ex_i^{-1} = \neg ex_i$
6. $\neg cm_i = ab_i$
7. $\neg ab_i = cm_i$
8. $cm_i^{-1} = ab_i$

The first assumption indicates that every task that
has started its execution must either commit or abort.
The second assumption states that a compensating task
must always commit. The third assumption states that
a task that has not yet begun is functionally equivalent
to (denoted as =) the state where the task has started
its execution but has aborted. Assumption 4 says that if
a compensating task has not started its execution, then
this state is the same as that of proceeding with the
execution of the corresponding task. Similarly, assumption 5
states that the successful completion of a compensating
task results in the same state as when its corresponding task
has not started its execution. Assumptions 6 and 7 state
that abort and commit are complements of each other.
Assumption 8 states that the successful completion of a
compensating task is functionally equivalent to undoing
all the effects of its corresponding task. The first five are
referred to as the basic assumptions while the last three
can be derived from the basic assumptions.

    x        | bc  | a   | e  | ba | serial
    x^{-1}   | fba | fba | sc | sc | fba

Table 1: Some RD type dependencies and their inverses

Based on the above assumptions, we can derive the
inverse dependency for any given dependency. In the
following example, we show our reasoning about how
$t_i \xrightarrow{bc} t_j$ is equivalent to
$t_i \xrightarrow{fba} t_j^{-1}$. Recall from section 2.1.1
that $t_i \xrightarrow{bc} t_j$ represents the dependency that
task $t_j$ cannot begin until task $t_i$ commits, which implies
the following logical relationship: $cm_i \Leftarrow ex_j$. It results
in the following PSS: $\{(cm_j, cm_i), (ab_j, cm_i), (ab_j, ab_i)\}$.
Note that the first two are due to assumption 1 whereas
the last one is due to assumption 3.

On the other hand, $t_i \xrightarrow{fba} t_j^{-1}$ states that if
$t_i$ aborts then the compensating task $t_j^{-1}$ must begin its
execution. Therefore, it reflects the following logical
relationship: $ab_i \Rightarrow ex_j^{-1}$, which results in the following
PSS: $\{(ab_i, ab_j), (cm_i, cm_j), (cm_i, ab_j)\}$. Note that the
first two are derived from assumptions 2, 3 and 5 whereas
the last by applying assumptions 1 and 4. Since
$t_i \xrightarrow{bc} t_j$ and $t_i \xrightarrow{fba} t_j^{-1}$
result in the same possible sets, we say
$t_i \xrightarrow{bc} t_j$ is equivalent to
$t_i \xrightarrow{fba} t_j^{-1}$. Table 1 lists some
RD dependencies and their inverses.*
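
The equivalence argued above can be replayed by brute-force enumeration. The following Python sketch is an illustration only: it encodes final task states as the strings "cm" and "ab", treats "never began" and "compensated" as "ab" in line with assumptions 3 and 8, and the function names are made up for the example.

```python
from itertools import product

CM, AB = "cm", "ab"

def pss_bc():
    """Outcomes (t_i, t_j) when t_j may begin only after t_i commits."""
    states = set()
    for ti in (CM, AB):
        if ti == CM:
            # t_j executes and, by assumption 1, commits or aborts
            states.update((ti, tj) for tj in (CM, AB))
        else:
            # t_j never begins; by assumption 3 this counts as aborted
            states.add((ti, AB))
    return states

def pss_fba_inverse():
    """Outcomes when t_j runs freely and t_j^{-1} must begin if t_i aborts."""
    states = set()
    for ti, tj in product((CM, AB), repeat=2):
        if ti == AB:
            # the compensating task runs and always commits (assumption 2),
            # leaving t_j in a state equivalent to aborted (assumption 8)
            tj = AB
        states.add((ti, tj))
    return states

print(pss_bc() == pss_fba_inverse())
```

Both functions produce the same three state pairs, which is the sense in which the inverse dependency on the compensating task stands in for the original begin-on-commit dependency.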

The above formalism can be used to enforce high-to-low
dependencies as follows. For example, if there exists
a high-to-low dependency $t_i \xrightarrow{bc} t_j$, since the
inverse of bc is “fba,” and “bc” is strong and non-abortive, we
replace the above dependency with
$t_i \xrightarrow{fba} t_j^{-1}$, meaning
that the compensating task $t_j^{-1}$ will begin if $t_i$ aborts.
That is, both $t_i$ and $t_j$ can be executed independently,
thus $t_j$ need not wait for $t_i$, thereby eliminating potential
covert channels. However, if $t_i$ aborts, a compensating
task $t_j^{-1}$ will be started. Indeed, this compensating task
must be executed by a trusted subject, e.g., a human
user. If there are any dependencies involving $t_j$ (e.g.,
$t_j \rightarrow t_k$), compensating $t_j$ requires $t_k$ to be
compensated to capture the cascaded compensation.

* Since the compensating tasks need to be executed by a trusted
subject (e.g., a human user), the no-force-commit assumption can
be ignored here.

Figure 6: Modified Workflow after executing compensate_task
for Example 2

Our approach does not compromise security, yet it
can enforce equivalent compensating dependencies if all
tasks are compensatable; it thus ensures SSSC-level
execution. Figure 6 shows the modified workflow for example
2. A proof and justification for Table 1 can be found in
[3].

The following algorithm shows how the above formalism
can be employed to modify a high-to-low CF-RD
type of dependency by introducing a compensating task.

Algorithm 3 [compensate_task]

for  every 

if  there  exists  a 
remove  tj  from  W 
if  X  is  strong  non-abortive  or  weak  abortive 
add  a  node  and 

edges  ii  ^  ij^  and  tj  in  W 

/* compensate the low task and add a new inverse
dependency of the original high-to-low dependency
from parent to the compensating task of child */
for  every  tj  where  ii  is  the  closest- )-ances tor  of 

add  an  edge  ti  tj  in  W 
end {for} 
for  every  tj 

parent  ♦-  j  and  child  k 
execute  cascaded- compensation 
endjfor} 
end{for} 

cascaded- compensation 

for  each  tparent  *  I'child 
if  there  exists  a 

add  a  node  and  an  edge  ^ 

parent  ♦—  child 
endjfor)  □ 

Theorem 2 Let W be a workflow. Let W' be the workflow
obtained from W by applying algorithm 3 to W. If
for every dependency $t_i \xrightarrow{x} t_j$ in W where x is strong
non-abortive or weak abortive, there exists a $t_j^{-1}$ and a
$t_k^{-1}$ where $t_k$ is a descendant of $t_j$, then W' covers W. □

Weak non-abortive type. According to the no-force-commit
assumption, the weak non-abortive type dependencies
in the CF-RD category can only have the following
scenarios:

(1) a “can” relationship on the commit primitive of the
child task such as a commit dependency ($t_j$ can commit if
$t_i$ commits).

(2) a “must” relationship on the begin primitive of the
child task such as a weak begin-on-commit dependency ($t_j$
must begin if $t_i$ commits).

In dealing with such a high-to-low dependency, we
can simply ignore it without degrading the correctness
level because these dependencies result in a PSS with
all possible combinations of the states of the two tasks.
For example, the weak begin-on-commit dependency implies
the logic relationship $cm_i \Rightarrow b_j$, which results in a
PSS = $\{(cm_i, cm_j), (cm_i, ab_j), (ab_i, cm_j), (ab_i, ab_j)\}$. In
other words, it is the same as if the two tasks were without
any logic relationship.
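
A short enumeration makes the point concrete. The sketch below is illustrative only: it models the weak begin-on-commit rule as "if $t_i$ commits then $t_j$ begins", counts a task that never began as aborted (assumption 3), and uses a hypothetical function name.

```python
from itertools import product

def pss_weak_bc():
    """Outcomes (t_i, t_j) under cm_i => b_j with no other constraint."""
    states = set()
    for ti, tj_begins in product(("cm", "ab"), (True, False)):
        if ti == "cm" and not tj_begins:
            continue  # violates the rule: t_j must begin if t_i commits
        # a task that began commits or aborts; one that never began
        # counts as aborted
        outcomes = ("cm", "ab") if tj_begins else ("ab",)
        states.update((ti, tj) for tj in outcomes)
    return states

print(sorted(pss_weak_bc()))
```

All four combinations of commit and abort appear, so dropping the dependency does not remove any outcome that the dependency would have allowed.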

5.2.3 Enforcing CN-RD type high-to-low
Dependencies

As noted earlier, we treat CN-RD as a combination
of CN-RI and CF-RD. Therefore we use both split_task
and compensate_task, in an approach called
split_compensate_task. The split_compensate_task approach
consists of two steps. In the first step, the dependency is
treated as if it is a CN-RI type in which the high task is
split using the split_task algorithm. In the second step, the
CF-RD dependency between the high partition of the parent
task and the child task is handled using the compensate_task
algorithm. Algorithm 4 formally presents the above
illustration.

Algorithm 4 [split_compensate_task]

for each control-flow dependency ($t_i \xrightarrow{x} t_j$) in W such
that $L(t_i) > L(t_j)$
    if (x is of type CN-RD), then
        re-label x as CN-RI;
        execute algorithm split_task;
        $t_i \leftarrow t_i - t_i^s$;
        add an edge $t_i \xrightarrow{x} t_j$
        and label x as CF-RD;
        execute algorithm compensate_task;
end{for} □

In  figure  7,  we  summarize  how  each  type  of  depen¬ 
dency  can  be  redesigned. 

5.3  Algorithm  for  the  Execution  of  MLS 
Workflows 

Given a workflow specification W, in the following, we
present an algorithm to derive a workflow execution
graph (WEG) that determines the execution order of the
tasks in a workflow. Here we assume that if there exists
a task dependency, it is of only one type.

Figure 7: Approach for redesigning each type of dependency

Figure 8: An example workflow (W)

Algorithm 5 [Constructing WEG from W]

nodes of WEG are tasks in W;
include all dependencies ($t_i \xrightarrow{x} t_j$) in W as edges of
WEG such that $L(t_i) \leq L(t_j)$
for each control-flow dependency ($t_i \xrightarrow{x} t_j$) in W such
that $L(t_i) > L(t_j)$
    if (x is of type CF-RD), then
        if (x is a weak abortive or strong non-abortive
        dependency), then
            execute algorithm compensate_task;
        else ignore x;
    elseif (x is of type CN-RI), then
        execute algorithm split_task;
    elseif (x is of type CN-RD), then
        execute algorithm split_compensate_task;
end{for} □

Theorem 3 Let W be a workflow. Let W* be the workflow
obtained by applying algorithm 5 to W. Then, W*
covers W. □

The workflow execution graph thus constructed consists
of all dependencies from low-to-high unless it is to
a compensating task. Since all compensating transactions
that involve a high-to-low dependency are executed
by trusted subjects, enforcing the dependencies in WEG
does not cause any covert channels. Figure 9 shows the
WEG for the W in figure 8.

Theorem 4 If W consists of all CN-RI type high-to-low
dependencies, then the execution of WEG achieves
SSSC-level of execution. □

Theorem 5 If there exists a $t_j^{-1}$ for every dependency
$t_i \xrightarrow{x} t_j$ in W where x is strong non-abortive
or weak abortive, then the execution of WEG achieves
SSSC-level of execution. □

Theorem 6 If W consists of all CN-RD type high-to-low
dependencies, then the execution of WEG achieves
SSSC-level of execution. □

Figure 9: The WEG for W in figure 8
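
The case analysis of Algorithm 5 amounts to a dispatch on the dependency category. The following Python sketch only records which redesign step would apply to each dependency; it does not perform the splitting or compensation itself. The dictionary keys, the function name `plan_weg`, the integer levels, and the decision to keep same-level dependencies unchanged are all assumptions made for this illustration.

```python
def plan_weg(dependencies):
    """Return (kept_edges, actions) for a list of dependency records.

    Each record is a dict with hypothetical keys: 'parent', 'child',
    'parent_level', 'child_level', 'category' ("CF-RD", "CN-RI" or
    "CN-RD"), 'strong' (bool) and 'abortive' (bool).
    """
    kept, actions = [], []
    for d in dependencies:
        edge = (d["parent"], d["child"])
        if d["parent_level"] <= d["child_level"]:
            kept.append(edge)  # not high-to-low: becomes a WEG edge as is
        elif d["category"] == "CF-RD":
            weak_abortive = (not d["strong"]) and d["abortive"]
            strong_non_abortive = d["strong"] and not d["abortive"]
            if weak_abortive or strong_non_abortive:
                actions.append((edge, "compensate_task"))
            else:
                actions.append((edge, "ignore"))
        elif d["category"] == "CN-RI":
            actions.append((edge, "split_task"))
        elif d["category"] == "CN-RD":
            actions.append((edge, "split_compensate_task"))
    return kept, actions

# Made-up workflow with one low-to-high and two high-to-low dependencies.
deps = [
    {"parent": "t1", "child": "t2", "parent_level": 0, "child_level": 1,
     "category": None, "strong": True, "abortive": False},
    {"parent": "t2", "child": "t3", "parent_level": 1, "child_level": 0,
     "category": "CN-RI", "strong": True, "abortive": False},
    {"parent": "t2", "child": "t4", "parent_level": 1, "child_level": 0,
     "category": "CF-RD", "strong": True, "abortive": False},
]
print(plan_weg(deps))
```

Separating the plan from its execution keeps the sketch small; a fuller version would invoke the corresponding redesign routine for each recorded action and add the resulting nodes and edges to the WEG.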


5.4  Related  Work 

Research addressing the incorporation of multilevel security
in workflow management systems is fairly new. Recently
Atluri and Huang in [2] have proposed a Petri net based
approach which can automatically detect and prevent all
task dependencies that can potentially cause covert channels.
Since their approach eliminates some dependencies,
it cannot guarantee correct execution of multilevel secure
workflows.

Several researchers have addressed issues concerning execution models for
multilevel transactions, which are relevant to our work. Unlike traditional
transactions, a multilevel transaction can read as well as write at multiple
security levels. Multilevel transaction execution cannot meet both atomicity
and secrecy requirements, because aborting a portion of the transaction at a
lower security level due to the abort of its higher level counterpart creates
information flows that violate multilevel security restrictions. Since
multilevel transactions can be modeled in our workflow framework, the
solutions we propose in this paper are applicable to this area as well. Some
of the earlier solutions deal with this problem by relaxing the atomicity
requirements. For example, Blaustein et al. [5] have proposed several levels
of atomicity, and show that, based on the structure of the multilevel
transaction, only a certain level of atomicity can be achieved. They have
proposed two algorithms, called Low-First and High-Ready-Wait. In Low-First,
single level portions (called sections) are executed in the order of
increasing security level. That is, all lower level sections must be executed
and committed before a higher level section starts execution. Thus, this
algorithm cannot allow high-to-low dependencies, and Low-First can make no
guarantees on the level of atomicity. In High-Ready-Wait, all sections of a
multilevel transaction are executed (but not committed) in a high-to-low
order and then committed in a low-to-high order. Thus, High-Ready-Wait cannot
enforce low-to-high dependencies. Moreover, it works only for hierarchically
ordered security structures and may also cause a limited bandwidth covert
channel. Thus, Blaustein et al.'s approach works only if all dependencies are
either low-to-high or high-to-low, but does not work if there exist both
high-to-low and low-to-high dependencies (which are referred to as cycles in
[5]). Our redesigning approach can be employed to eliminate some of the
high-to-low dependencies, thus increasing the degree of atomicity that can be
guaranteed. Note that earlier researchers have also proposed techniques for
the elimination of high-to-low dependencies between sections of the
multilevel transaction by rewriting the section [5] or maintaining multiple
versions of data [7], but their rewriting of each section requires a careful
examination of the semantics of the transaction by a human user, and moreover
may not be possible in all cases. In contrast, our approach redesigns a
workflow by simply examining the read and write operations of the tasks and
therefore can be fully automated. Our approach is similar to the cache scheme
proposed by Smith et al. [15]. Later, Ammann et al. [1] also proposed a
solution based on semantic atomicity, which again requires rewriting
multilevel transactions manually. Moreover, Ammann et al.'s approach is based
on the assumption that every dependency from a higher to a lower level task
can be converted into one from a lower to a higher level. In this paper, we
characterize all the types of dependencies and show that only certain types
(called conflicting in section 4) can be converted in such a way. Another
advantage of modeling a multilevel transaction as a workflow transaction
model is that it allows one to distinguish the various types of dependencies
that can occur among the sections of a multilevel transaction. This allows
one to identify the sections that must be executed atomically, instead of the
entire transaction, thereby allowing one to specify relaxed atomicity
requirements.
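
The execution and commit orders attributed above to Low-First and High-Ready-Wait can be made concrete with a small Python sketch. This is only a reading of the description in this section, not Blaustein et al.'s actual algorithms; the function names, the representation of sections as a name-to-level mapping, and the use of integer (hierarchically ordered) levels are assumptions for illustration.

```python
def low_first(sections):
    """sections maps a section name to an integer security level.

    Each section is executed and committed before any higher level
    section starts, so a higher section can never influence a lower one.
    """
    schedule = []
    for name in sorted(sections, key=sections.get):
        schedule.append(("execute", name))
        schedule.append(("commit", name))
    return schedule


def high_ready_wait(sections):
    """Execute all sections high-to-low, then commit them low-to-high."""
    by_level = sorted(sections, key=sections.get)
    schedule = [("execute", name) for name in reversed(by_level)]
    schedule += [("commit", name) for name in by_level]
    return schedule
```

In the first schedule a low section has already committed before any high section runs, which is why high-to-low dependencies cannot be honored; in the second, high sections execute before low ones, which is why low-to-high dependencies cannot be enforced.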

6  Conclusions 

Correct execution of multilevel secure workflows requires enforcing all the
task dependencies. However, ensuring high-to-low dependencies is difficult
because of the inherent conflicts between security and correctness. In this
paper, we show how a multilevel secure workflow can be executed in a secure
and correct manner. Our approach is based on a semantic classification of the
task dependencies that examines the source of the task dependencies. We
propose algorithms to automatically redesign the workflow in such a way that
all task dependencies can be enforced without compromising security. Note
that workflows can be executed by untrusted commercially available workflow
management systems, although the redesign algorithm must be trusted. For
details of our system architecture, the reader may refer to [3]. Our
solutions are directly applicable to another relevant area of research, the
execution of multilevel transactions in multilevel secure databases, since
the atomicity requirements and other semantic requirements can be modeled as
a workflow. Modeling a multilevel transaction as a workflow transaction model
allows one to distinguish the various types of dependencies that can occur
among the sections of a multilevel transaction. This allows one to identify
the sections that must be executed atomically, instead of the entire
transaction, thereby allowing one to specify relaxed atomicity requirements.
Our redesign process can be used to increase the degree of atomicity one can
guarantee for a multilevel transaction. Note that, unlike prior research in
this area, which requires redesign based on the semantics and therefore a
careful examination by a human, our approach can be fully automated.

A Proofs

Theorem 1 Let W be a workflow. Let W' be the workflow obtained from W by
applying algorithm 2 to W. Then W' covers W.

Proof: Algorithm 2 keeps all the dependencies $t_i \rightarrow t_j$ that are
not of CN-RI type intact. Thus to prove this theorem we need to consider only
the CN-RI type dependencies $t_i \rightarrow t_j$. We prove that W' covers W
by proving that all three conditions in definition 7 are true for all CN-RI
type dependencies.

Since all the operations removed from $t_i$ while splitting it are placed in
some lower level partition $t_i^s$ by algorithm 2, condition 1 of definition
7 is trivially true.

Now we prove condition 2 of definition 7. Let $t_i \rightarrow t_j$ be a
CN-RI type dependency in W. To prove condition 2 of definition 7, we should
prove that applying algorithm 2 does not change the order of the operations
conflicting with those of $t_i$ that are either in $t_i$ itself or that
belong to an ancestor or descendant of $t_i$. According to algorithm 2, $t_i$
is split into partitions as per definition 10. Because algorithm 2 adds a
dependency $t_i^s \rightarrow t_i$ from every partition $t_i^s$, where
$s < L(t_i)$, the order of conflicting operations within the original $t_i$
is preserved in W'.

Because of $t_i \rightarrow t_j$, $t_i$ is an ancestor (in fact the parent)
of $t_j$. Thus all operations of $t_i$ precede those of $t_j$ in W. According
to algorithm 1, $t_i$ must contain at least one operation $r_i[d]$ such that
$L(d) = L(t_j)$, and there must exist a $w_j[d]$ in $t_j$. Thus,
$t_i^s \neq \emptyset$ for $s = L(t_j)$. Because algorithm 2 adds an edge
$t_i^s \rightarrow t_j$, it implies that $t_i^s$ is an ancestor of $t_j$,
i.e., all operations of $t_i^s$ precede those of $t_j$. Thus W' preserves the
order of conflicting operations between $t_i$ and $t_j$. If there is any
child $t_k$ of $t_i$ other than $t_j$, then splitting $t_i$ will not change
the order of conflicting operations of $t_k$ and $t_i$, because these
operations will only be in $t_i$ (since all other partitions consist of
read-only operations, and if there is any other CN type dependency, it should
have been expressed as another dependency), and all the other dependencies
$t_i \rightarrow t_k$ are not affected while splitting.

Now we prove that algorithm 2 preserves the order of conflicting operations
of $t_i$ and its ancestors. Suppose $t_k$ is an ancestor at level
$s < L(t_i)$. Then $t_i$ conflicts with $t_k$ only if $t_k$ has a write
operation on a data object $d$ ($L(d)$ must be equal to $s$) and $t_i$ has a
read operation involving the same $d$. That means $t_i^s \neq \emptyset$. A
dependency $t_k \rightarrow t_i^s$ would preserve the order of the
conflicting operations of $t_i$ and $t_k$. However, if there exists another
ancestor of $t_i$ at level $s$, say $t_l$, such that $t_l$ is a descendant of
$t_k$, then a dependency $t_l \rightarrow t_i^s$ is enough, instead of
$t_k \rightarrow t_i^s$, to preserve the order of the conflicting operations
among $t_k$ and $t_i$ as well as $t_l$ and $t_i$. Applying the same logic, we
can conclude that the order of conflicting operations is preserved if we add
a dependency $t_l \rightarrow t_i^s$, where $t_l$ is the closest-s-ancestor
of $t_i$. Since $t_i^s$ does not contain a conflicting operation with any
other partition $t_i^{s'}$ where $s \neq s' < L(t_i)$, it is not necessary to
add a dependency from $t_i^s$ to such $t_i^{s'}$. Thus we prove condition 2
of definition 7.

All the tasks in W are also in W'. Moreover, the additional tasks added by
algorithm 2 are simply read-only tasks; thus they do not affect the commit or
abort of the original tasks. Thus PSS(W) = PSS(W'). This proves condition 3
of definition 7. Therefore W' covers W. □

Theorem 2 Let W be a workflow. Let W' be the workflow obtained from W by
applying algorithm 3 to W. If for every dependency
$t_i \xrightarrow{x} t_j$ in W where x is strong non-abortive or weak
abortive, there exists a $t_j^{-1}$ and a $t_k^{-1}$, where $t_k$ is a
descendant of $t_j$, then W' covers W.


Proof: Since algorithm 3 does not remove any operations from any task,
condition 1 of definition 7 is trivially satisfied. In case of CF-RD type,
since $t_i$ and $t_j$ do not have any conflicting operations, condition 2 of
definition 7 is always true.

We prove condition 3 of definition 7 as follows. For every
$t_i \xrightarrow{x} t_j$, we assume there exist descendants of $t_i$,
$t_{k_1}, \ldots, t_{k_n}$. We use the following property:

$PSS(t_i, t_j, t_{k_1}, \ldots, t_{k_n}) = PSS(t_i, t_j) \cup PSS(t_j, t_{k_1}) \cup \cdots$

Consider the case where x is either strong non-abortive or weak abortive.
$PSS(t_i, t_j, t_j^{-1})$ covers $PSS(t_i, t_j)$. Similarly,
$PSS(t_j, t_{k_1}, t_{k_1}^{-1})$ covers $PSS(t_j, t_{k_1})$, and so on. Thus
$PSS(t_i, t_j, t_j^{-1}, t_{k_1}, t_{k_1}^{-1}, \ldots, t_{k_n}, t_{k_n}^{-1})$
covers $PSS(t_i, t_j, t_{k_1}, \ldots, t_{k_n})$. □


Theorem 3 Let W be a workflow. Let W' be the workflow obtained by applying
algorithm 5 to W. Then W' covers W.

Proof: This trivially follows from theorems 1 and 2. This is because
algorithm 2 does not add any new CF-RD type dependency, and algorithm 3 and
algorithm 4 do not add any new CN-RI type dependency; thus this process will
not be cyclic. □

Theorem 4 If W consists of all CN-RI type high-to-low dependencies, then the
execution of WEG achieves SSSC-level of execution.

Proof: Since in WEG all the CN-RI high-to-low dependencies are converted by
algorithm 2 into low-to-high dependencies, all dependencies can be enforced
without introducing any covert channels. Thus execution of WEG achieves
SSSC-level execution. □

Theorem 5 If there exists a $t_j^{-1}$ for every dependency
$t_i \xrightarrow{x^{CF\text{-}RD}} t_j$ in W in case where x is strong
non-abortive or weak abortive, then the execution of WEG achieves SSSC-level
of execution.

Proof: Since all the CN-RI high-to-low dependencies are converted by
algorithm 2 into low-to-high dependencies, all CN-RI dependencies can be
enforced without introducing any covert channels.

If x is not weak and non-abortive, according to algorithm 3, every
$t_i \rightarrow t_j$ high-to-low dependency is removed and a new high-to-low
dependency $t_i \rightarrow t_j^{-1}$ is introduced in WEG. This new
dependency can, however, be added only if $t_j^{-1}$ exists. Since this new
dependency is enforced only by a trusted subject, it does not introduce any
covert channels. Thus execution of WEG achieves SSSC-level execution. □

Theorem 6 If W consists of all CN-RD type high-to-low dependencies, then the
execution of WEG achieves SSSC-level of execution.

Proof: In WEG, all the CN-RD high-to-low dependencies can be broken down into
a CN-RI type and a CF-RD type. Thus, by theorem 4 and theorem 5, the
execution of WEG that contains all CN-RD type dependencies achieves
SSSC-level execution. □


ABSTRACT

In this paper, we develop a new paradigm for access control and authorization management, called task-based
authorization controls (TBAC). TBAC is particularly suited for emerging models of computing. In
particular, this includes distributed computing and information processing activities with multiple points of
access, control, and decision-making. TBAC articulates security issues at the application and enterprise
level. As such, it takes a task-oriented perspective rather than the traditional subject-object one. Access
mediation now involves authorizations at various points during the completion of tasks in accordance with
some application logic. In contrast, the subject-object view of access control typically divorces access
mediation from the larger context in which a subject performs an operation on an object. By taking a
task-oriented view of access control and authorizations, TBAC lays the foundation for research into a new
breed of “active” security models. TBAC has broad applicability ranging from access control for
fine-grained activities such as client-server interactions in a distributed system, to coarser units of distributed
applications and workflows that cross departmental and organizational boundaries. Furthermore, the
ideas in TBAC form the basis for enterprise-oriented policy modeling and enforcement tools.


1.  Introduction 

In  this  paper,  we  describe  a  new  paradigm  for  access  control  and  security  models,  called  task-based 
authorization  controls  (TBAC)  that  is  particularly  suited  for  emerging  models  of  computing.  In  particular, 
this  includes  distributed  computing  and  information  processing  activities  with  multiple  points  of  access, 
control,  and  decision  making  such  as  that  found  in  workflow  and  distributed  process  management  systems. 

TBAC  differs  from  traditional  access  controls  and  security  models  in  many  respects.  Instead  of  having  a 
system-centric  view  of  security,  TBAC  approaches  security  modeling  and  enforcement  at  the  application 
and enterprise level. Second, TBAC lays the foundation for a new breed of what we call “active” security
models.  By  active  security  models,  we  mean  models  that  approach  security  modeling  and  enforcement  from 
the  perspective  of  activities  or  tasks,  and  as  such,  provide  the  abstractions  and  mechanisms  for  the  active 
runtime  management  of  security  as  tasks  progress  to  completion.  Such  a  task-based  approach  to  security 
represents  a  radical  departure  from  classical  passive  security  models  such  as  those  based  on  one  or  more 
variations  of  the  subject-object  view  of  security  and  access  control.  In  a  subject-object  view  of  security,  a 
subject  is  given  access  to  objects  in  a  system  based  on  some  permissions  (rights)  the  subject  possesses. 
However,  such  a  subject-object  view  typically  divorces  access  mediation  from  the  larger  context  (such  as 
the  current  state  of  tasks)  in  which  a  subject  performs  an  operation  on  an  object. 

Our focus in this paper is on active security models for authorization management and access control in
computerized information systems. An authorization is an approval act and manifests itself in the paper
world as the act of signing a form. Typically, in the paper world, an authorization results in the enabling of
one or more activities and related permissions. The person granting the authorization usually takes
responsibility for the actions that are authorized by the authorization. Also, an authorization, as represented
by a signature, has a lifetime associated with it during which it is considered valid. Once an authorization
becomes invalid, organizations require that the associated permissions no longer be available. As
paper-based systems become computerized, the related authorization procedures will have to become automated.
Thus the TBAC approach described in this paper was motivated by this anticipated need to automate
authorization and related access controls. In particular, the implementation of TBAC ideas will lead to
systems that provide tighter just-in-time need-to-do permissions. Also, the TBAC approach leads to access
control models that are self-administering to a great extent, thereby reducing the overhead typically
associated with fine-grained subject-object security administration.

To  motivate  further  the  need  for  active  TBAC  security  controls,  consider  a  typical  workflow-like  electronic 
procedure  for  preparing  and  submitting  a  patent  application.  For  the  purpose  of  our  discussion,  we  can  think 
of  a  workflow  as  a  partially  ordered  set  of  tasks  where  each  task  may  involve  one  or  more  participants  and 
invoked applications, and also issues one or more operations to access and manipulate various objects. The
main activities in the patent workflow are shown in Figure 1.


[Figure 1 diagram; recoverable labels: Authorize-patent-officer, Authorize-scientific-officer]


Figure  1.  A  workflow  to  process  a  patent  application 


The  first  activity  involves  someone  or  a  research  group  writing  up  the  idea  for  the  patent.  This  will  require 
read  and  write  permissions  to  a  set  of  electronic  documents.  Once  the  writing  is  done,  someone,  such  as  the 
head  of  the  research  group,  grants  an  authorization  to  review  the  documents.  This  authorization  does  two 
things.  First,  it  gives  a  patent  officer  and  a  scientific  officer  the  read  and  write  permissions  required  to 
review  and  make  corrections.  Second,  it  revokes  the  write  permissions  the  original  authors  had  so  that  they 
cannot  make  any  changes  to  the  documents  while  the  reviews  are  being  done.  Once  these  officers  have 
completed the reviews, they grant further authorizations which enable the workflow to progress to the
preparation of a complete patent application package. At this point, the original authors, whose write
permissions were revoked, may be granted write permissions again temporarily to make additional
corrections. After some deadline, the authors will be prevented from making any more modifications and
the workflow then progresses to the fee processing activity. Someone, possibly in the finance or accounts
department, verifies that there are enough funds in an account to pay for the patent application and then
authorizes  payment.  This  authorization  grants  an  accounting  clerk  a  one-time  permission  to  debit  the 
required  amount  of  money  from  the  account  and  then  to  issue  a  check.  Finally,  the  workflow  progresses  to 
the  last  activity  which  is  the  filing  of  the  application.  Someone  authorizes  a  specific  courier  to  carry  the 
application  to  the  patents  office  or  alternatively  authorizes  the  electronic  filing  of  the  application. 
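
The effect of the review authorization described above (granting read and write permissions to the reviewers while revoking the authors' write permissions) can be sketched in a few lines of Python. This is only an illustration of the narrative; the function name `authorize_review`, the representation of permissions as (subject, action) pairs on the patent documents, and the example subject names are assumptions, not part of the TBAC model itself.

```python
def authorize_review(permissions, authors, reviewers):
    """Return the permission set in force after the review authorization.

    permissions is a set of (subject, action) pairs on the patent documents.
    Reviewers gain read and write; authors lose write but keep read.
    """
    updated = set(permissions)
    for reviewer in reviewers:
        updated.add((reviewer, "read"))
        updated.add((reviewer, "write"))
    for author in authors:
        updated.discard((author, "write"))
    return updated
```

For example, starting from `{("alice", "read"), ("alice", "write")}` with `alice` as the author, the call leaves `alice` with read access only while both officers can read and write.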


In  a  paper-based  implementation  of  the  above  workflow,  the  relevant  authorizations  manifest  as  signatures 
on  various  forms  as  they  are  routed  through  various  departments.  At  each  point,  responsible  parties  are 
required  to  manually  inspect  forms  to  ensure  that  the  prerequisite  signatures  (authorizations)  are  present  so 
that the whole procedure conforms to appropriate organizational policies and checks and balances.
However, when such workflows are automated, we need an active approach to authorization management.
The granting, usage tracking, and revoking of permissions need to be automated and coordinated with the
progression  of  the  various  tasks.  Without  active  authorization  management,  permissions  will  in  most  cases 
be  “turned  on”  too  early  or  too  late  and  will  probably  remain  “on”  long  after  the  workflow  tasks  have 
terminated.  This  opens  up  vulnerabilities  in  systems.  Any  attempt  to  minimize  such  vulnerabilities  will 
require  a  security  administrator  to  keep  track  of  the  progress  of  the  tasks  for  all  enacted  workflow  instances; 
an  error-prone  and  impossible  task!  Thus  what  is  needed  is  an  approach  where  access  control  permissions 
are  granted  and  revoked  according  to  the  validity  of  authorizations  and  one  where  this  can  be  done  without 
manual  security  administration.  The  authorizations  themselves  are  of  course  processed  strictly  according  to 
some  application  logic  and  policy.  In  the  remaining  sections  of  this  paper  we  will  describe  how  TBAC  ideas 
can  be  used  to  accomplish  this. 

There  are  basically  two  broad  objectives  guiding  our  research  efforts  in  TBAC.  The  first  is  to  model  from 
an  enterprise  perspective,  various  authorization  policies  that  are  relevant  to  organizational  tasks  and 
workflows.  We  envision  a  set  of  user-friendly  tools  to  help  a  security  officer  model  and  specify  policies. 
Our  second  objective  is  to  seek  ways  in  which  these  modeled  policies  can  be  automatically  enforced  at 
runtime  when  the  corresponding  tasks  are  invoked.  We  limit  the  discussion  in  this  paper  to  the  core 
concepts  in  TBAC  that  form  our  conceptual  framework.  Various  aspects  of  our  research  such  as  languages 
to  model  authorization  policies,  as  well  as,  the  runtime  mapping  of  these  policies  to  enforcement 
mechanisms  are  topics  of  ongoing  investigation  and  will  be  reported  in  subsequent  publications.  We  also  do 
not  address  the  TCB-style  issues  related  to  assurance,  as  our  focus  at  this  point  is  not  on  implementation. 

Preliminary  ideas  for  TBAC  that  recognized  the  need  for  active  security  were  presented  in  [3]  and  [4]. 
More  recently,  a  workflow  authorization  model  (WAM)  was  presented  in  [10].  WAM  has  the  same  general 
motivation  as  TBAC  in  that  it  tries  to  provide  some  notions  of  active  security  and  just-in-time  permissions. 
However,  from  a  conceptual  standpoint,  TBAC  is  significantly  different  and  more  comprehensive  than 
WAM.  In  WAM  an  authorization  is  a  more  primitive  concept  and  represents  the  fact  that  a  subject  has  a 
privilege  on  an  object  for  a  certain  time  interval.  In  TBAC  an  authorization  (step)  has  much  richer 
semantics  as  it  models  the  equivalent  of  an  authorization  in  the  paper  world.  An  authorization  act  in  the 
paper  world  may  result  in  the  granting  of  several  related  permissions.  Thus  in  TBAC  an  authorization-step 
is  a  convenient  abstraction  to  model  and  manage  a  set  of  related  permissions.  TBAC  also  provides  features 
such  as  usage  tracking  of  permissions,  lifecycle  management,  the  ability  to  put  permissions  temporarily  on 
hold  without  invalidating  them,  as  well  as  modeling  sets  of  authorizations  through  composite  authorizations. 


2.  Background:  From  Passive  Subject-object  Controls  to  Active  Task-based  Security 


In  this  section  we  discuss  how  TBAC  differs  from  the  traditional  subject-object  view  of  access  control. 


2.1  The subject-object view of access control

Figure 2 illustrates the traditional subject-object view of access control. In the subject-object view, the basic
entities are subjects, objects, and the rights possessed by subjects to
gain access to the various objects. This can conceptually be represented in an access control matrix such as
that in Figure 2. The horizontal and vertical projections of this matrix can be implemented in systems as
capabilities or access control lists, respectively. From the standpoint of security models, the subject-object
view of access control can be traced to the earlier security models such as the HRU model [8] and its
influence can be seen even in later work such as the typed access matrix model (TAM) [9].

Figure 2. The subject-object view of access control
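
The access control matrix and its two projections can be sketched in Python as follows. The sample subjects, objects, and rights, and the function names `capabilities` and `access_control_list`, are hypothetical and serve only to illustrate the row and column views.

```python
# Access control matrix: (subject, object) -> set of rights.
matrix = {
    ("alice", "report"): {"read", "write"},
    ("bob", "report"): {"read"},
    ("bob", "ledger"): {"read", "write"},
}


def capabilities(matrix, subject):
    """Horizontal projection (one row): the rights one subject holds per object."""
    return {obj: rights for (subj, obj), rights in matrix.items() if subj == subject}


def access_control_list(matrix, obj):
    """Vertical projection (one column): the rights each subject holds on one object."""
    return {subj: rights for (subj, o), rights in matrix.items() if o == obj}
```

Both projections carry the same information as the matrix; they differ only in whether it is stored with the subject or with the object.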

A  closer  examination  of  the  subject-object  view  of  access  control  will  reveal  the  following  characteristics. 

•  The implicit assumption that there is a central pool of resources to which we need to provide access
control.

•  Access  control  information  represents  isolated  units  of  security  information. 

•  Access  mediation  is  divorced  from  larger  operation  context. 

•  There  is  no  memory  of  any  evolving  context  associated  with  past  accesses. 

•  There  is  no  record  of  the  usage  of  permissions. 

•  Existing  permissions  can  be  revoked  but  cannot  be  put  on  hold. 

•  Requires  fine-grained  security  administration. 


In summary, the subject-object paradigm of access control takes a very system-centric view of protecting a
central pool of resources. It enforces a very simple access control discipline which can succinctly be stated
as: if a subject has requested an access operation to an object, and the subject possesses the permission for
the operation, then grant the access. Thus all the access decision function has to check is whether the subject
has the required permission. However, this simplicity is precisely the limitation of subject-object access
controls. No other contextual information about ongoing activities or tasks can be taken into account when
making access decisions (attempts at access control frameworks to overcome this limitation have
been discussed in [14] and [15]). Further, there is no record of the usage of individual permissions. So long
as a subject possesses a permission, it can perform the corresponding operation any number of times. We
thus consider this to be a passive model of security. Next we discuss how TBAC forms an active approach to
access control.


2.2  TBAC  as  an  active  security  model  for  authorization  management 

We characterize active security models as models that recognize the overall
context in which security requests arise and take an active part in the management of security as it relates to
the progress and emerging context within tasks (activities). Before we elaborate further on TBAC as an
active model, let us discuss some of the basic ideas in the TBAC approach.


Figure 3. An authorization-step as an abstraction that groups trustees and permissions

One of the most fundamental abstractions in TBAC is that of an authorization-step. It represents a primitive
authorization processing step and is the analog of a single act of granting a signature in the forms (paper)
world. From the standpoint of modeling, it is an abstraction that groups trustees* and various sets of
permissions, as illustrated in Figure 3. In the paper world, a group of individuals may be potentially allowed
to grant a certain type of signature. For example, all sales clerks may be allowed to sign sales orders.
However, a single instance of a signature may be granted only by a single individual. For example, sales
order 1208 is signed by sales clerk Tom. Similarly, in TBAC we associate an authorization-step with a
group of trustees called the trustee-set. One member of the trustee-set will eventually grant the
authorization-step when the authorization-step is instantiated. We call this trustee the executor-trustee of the
step. The permissions required by the executor-trustee to invoke and grant the authorization-step make up a
set of permissions called executor-permissions. Also, in the paper world, a signature implies that
certain permissions are granted (enabled). In a similar fashion, we model the set of permissions that are
enabled by every authorization-step. These permissions comprise the enabled-permissions set. Collectively,
we refer to the union of the executor-permissions and enabled-permissions as the protection-state of the
authorization-step. Finally, the authority granted by a signature is good only for a limited period of time.
Similarly, we associate a period of validity and a lifecycle with every authorization-step.
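
The components of an authorization-step named above can be gathered into a small Python data structure. This is a minimal sketch under stated assumptions, not the paper's formal model: the class and method names are hypothetical, the period of validity is reduced to a single expiry time, and the lifecycle is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuthorizationStep:
    name: str
    trustee_set: frozenset            # trustees allowed to grant this step
    executor_permissions: frozenset   # needed to invoke and grant the step
    enabled_permissions: frozenset    # turned on once the step is granted
    valid_until: float                # assumption: validity as an expiry time
    executor_trustee: Optional[str] = None

    def grant(self, trustee: str) -> None:
        """Record the single trustee who grants this instance of the step."""
        if trustee not in self.trustee_set:
            raise PermissionError(f"{trustee} is not in the trustee-set")
        if self.executor_trustee is not None:
            raise RuntimeError("this instance has already been granted")
        self.executor_trustee = trustee

    def protection_state(self) -> frozenset:
        """Union of the executor-permissions and the enabled-permissions."""
        return self.executor_permissions | self.enabled_permissions

    def is_valid(self, now: float) -> bool:
        """True while the step is within its (assumed) period of validity."""
        return now <= self.valid_until
```

In the sales order example, the trustee-set would hold all sales clerks, and calling `grant("tom")` on one instance records Tom as its executor-trustee; a second grant on the same instance is rejected.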

Classical subject-object access control:  $P \subseteq S \times O \times A$

TBAC view of access control:  $P \subseteq S \times O \times A \times U \times AS$

Figure 4. Subject-object versus TBAC views of access control


From  the  standpoint  of  access  control  models,  Figure  4  illustrates  how  the  TBAC  view  of  access  control 
differs  from  classical  subject-object  access  controls.  In  the  latter,  a  unit  of  access  control  or  permission 
information  can  be  seen  as  an  element  of  the  cross  product  of  three  domains  (sets),  namely  the  set  of 
subjects,  S,  the  set  of  objects,  O,  and  the  set  of  actions,  A.  In  TBAC,  access  control  involves  information 
about  two  additional  domains,  namely,  usage  and  validity  counts,  U,  and  authorization-steps,  AS.  These 
additional  domains  embed  task-based  contextual  information. 
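
The difference between the two permission domains can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the paper's definition: the domain U is reduced to a simple usage limit per permission, AS is represented by the name of an authorization-step, and the names `TbacPermission`, `classical_check`, and `tbac_check` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TbacPermission:
    subject: str
    obj: str
    action: str
    usage_limit: int   # stands in for the domain U (assumed: a plain counter)
    auth_step: str     # stands in for the domain AS
    used: int = 0


def classical_check(perms, subject, obj, action):
    """Subject-object view: membership in S x O x A is all that is checked."""
    return (subject, obj, action) in perms


def tbac_check(perms, subject, obj, action, auth_step):
    """TBAC view: also require the right authorization-step and remaining usage."""
    for p in perms:
        same = (p.subject, p.obj, p.action, p.auth_step) == (subject, obj, action, auth_step)
        if same and p.used < p.usage_limit:
            p.used += 1   # consume one use of the permission
            return True
    return False
```

With a usage limit of one, the classical check keeps succeeding on every request, whereas the TBAC-style check succeeds once and then refuses, which mirrors a one-time permission to debit an account.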

In  our  further  discussions,  it  is  useful  to  be  aware  of  the  distinction  between  an  authorization-step  class 
(definition)  such  as  authorize-review  in  the  patent  workflow  definition  presented  earlier  and  an 
authorization-step  instance  in  a  particular  workflow  instance  such  as  authorize-review  in  patent  workflow 
instance with identifier 1234 started at 9AM on Dec 1st, 1996. We use the term authorization-step loosely to
mean  authorization-step  class  or  authorization-step  instance,  as  determined  by  context.  When  the  context  is 
ambiguous  we  will  be  appropriately  precise. 


Figure  5  shows  the  concepts,  features,  and  components  that  make  TBAC  an  active  security  model.  These 
include  the  following: 

•  the  modeling  of  authorizations  in  tasks  and  workflows  as  well  as  the  monitoring  and  management  of 
authorization  processing  and  life-cycles  as  tasks  progress; 

•  the  use  of  both  type-based  as  well  as  instance  and  usage-based  access  control; 

•  the  maintenance  of  separate  protection  states  for  each  authorization-step; 


•  the dynamic runtime check-in and check-out of permissions from protection states as authorization-steps
are processed.

*  We use the term trustee to refer to any one of the following: user, process, agent, service or daemon.


[Figure 5 labels: workflows, authorizations, dependencies, task instances; type-based access control;
instance and usage-based access control]

Figure 5. TBAC as an active security model


Every authorization-step maintains its own protection state. The initial value of a protection state is the set
of permissions that are turned on (active) as a result of the authorization-step becoming valid. However, the
contents of this set will keep changing as an authorization-step is processed and the relevant permissions are
consumed. With each permission we associate a certain usage count. When a usage count has reached its
limit, the associated permission is deactivated and the corresponding action is no longer allowed.
Conceptually, we can think of an active permission as a check-in of the permission to the protection-state
and a deactivation of a permission as a check-out from the protection state. This constant and automated
check-in and check-out of permissions as authorizations are being processed is one of the central features
that make TBAC an active model. Further, the protection states of individual authorization-steps are unique
and disjoint. What this means is that every permission in a protection state is uniquely mapped to an
authorization-step instance and to the task or sub-task instance that is invoking the authorization. This
ability to associate contextual information with permissions is absent in typical subject-object style access
control models.
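The check-in and check-out behaviour described above can be illustrated with a minimal Python sketch. It is not part of the TBAC model itself: the class name ProtectionState, the (subject, object, action) key, and the use of None for "unlimited" are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical key for a permission: (subject, object, action).
PermKey = Tuple[str, str, str]


@dataclass
class ProtectionState:
    """Active permissions of one authorization-step instance (illustrative only)."""
    step_id: str
    # Remaining uses per active permission; None means unlimited.
    active: Dict[PermKey, Optional[int]] = field(default_factory=dict)

    def check_in(self, key: PermKey, uses: Optional[int]) -> None:
        """Activate a permission with a usage limit."""
        if uses is not None and uses <= 0:
            raise ValueError("usage limit must be positive or None")
        self.active[key] = uses

    def check_out(self, key: PermKey) -> None:
        """Deactivate a permission; a no-op if it is not active."""
        self.active.pop(key, None)

    def use(self, key: PermKey) -> bool:
        """Consume one use; return False if the permission is not active."""
        if key not in self.active:
            return False
        remaining = self.active[key]
        if remaining is not None:
            remaining -= 1
            if remaining == 0:
                self.check_out(key)  # limit reached: automatic check-out
            else:
                self.active[key] = remaining
        return True


state = ProtectionState(step_id="authorize-review#1234")
perm = ("reviewer_r", "patent_1234", "read")
state.check_in(perm, uses=2)
results = [state.use(perm), state.use(perm), state.use(perm)]
```

With a usage limit of two, the first two requests succeed and the third is refused because the permission has already been checked out of the protection state.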

The  distinction  between  type-based  and  instance  and  usage-based  access  control  is  also  a  significant  feature 
of the TBAC model. Type-based access control is used to encapsulate access control restrictions as
reflected  by  broad  policy  and  applied  to  types.  Instance  and  usage-based  access  control  on  the  other  hand,  is 
used  to  model  and  manage  the  details  of  access  control  and  protection  states  (permissions)  of  individual 
authorization  instances  including  keeping  track  of  the  usage  of  permissions. 

To elaborate more on how these concepts are used for the active management of authorizations, consider a
simple check-voucher processing example that involves the following sequence of authorization steps (for
brevity we show only the name of the authorization step and the trustee/role that can request the
authorization).

(1)  authorize_prepare_voucher  •  clerk

(2)  authorize_approve_voucher  •  supervisor

(3)  authorize_issue_check  •  clerk

Thus the processing of the voucher involves three phases, namely prepare, approval, and issue. Each phase
involves  an  authorization.  As  soon  as  the  prepare  phase  is  initiated  at  the  task  or  workflow  layer,  there  will 
be  an  invocation  of  the  first  authorization  to  prepare  the  voucher  (authorization-step  1).  This  authorization 
is  requested  by  a  clerk,  say  C.  At  this  point  TBAC  will  utilize  type-based  access  control  and  an  access 
decision  function  to  check  that  entities  of  type  “clerk”  are  allowed  to  do  the  “authorize-prepare”  operation 
on  vouchers.  If  this  check  succeeds  TBAC  will  proceed  to  check-in  (or  activate)  the  required  permissions  so 
that  the  specific  clerk,  C,  (who  in  this  case  is  the  executor  trustee)  can  do  the  prepare  operation.  These 
permissions are checked into the protection state of step 1. As mentioned earlier, we call these permissions
executor-permissions as they permit the executor to process the authorization. Now, as soon as clerk C is
done  with  preparing  and  authorizing  the  voucher,  we  consider  the  authorize_prepare_voucher  authorization- 
step  to  be  valid.  TBAC  will  now  do  two  things.  First,  TBAC  will  require  that  previously  checked-in 
executor  permissions  be  checked-out  (deactivated)  from  the  protection  state  of  the  authorization-step.  Next, 
TBAC  may  check-in  other  permissions  so  as  to  enable  the  processing  of  other  activities  including  the  next 
authorization-step  (step  2)  which  involves  authorization  for  the  approval  of  the  voucher  by  someone  in  the 
role of a supervisor. These permissions make up the enabled-permissions. During the processing of this
second  authorization,  the  supervisor  may  consume  these  checked-in  enabled-permissions  and  as  a  result 
eventually  lead  to  them  being  checked-out,  and  so  on.  Eventually  when  step  1  becomes  invalid,  all  enabled- 
permissions  that  are  still  checked-in  (active)  will  be  deactivated  (checked-out).  Finally,  when  the  third 
authorization-step  authorize_issue_check  is  invoked,  the  organizational  policy  may  dictate  a  separation  of 
duties  requirement.  In  other  words,  the  clerk  that  is  the  executor-trustee  of  the  third  step  will  have  to  be 
different  from  the  clerk  that  was  the  executor-trustee  of  the  first  authorization,  authorize_prepare_voucher. 
However,  the  scope  of  such  a  requirement  may  be  limited  to  only  these  three  authorizations  and  not  to  the 
rest  of  the  authorizations  in  a  workflow.  To  facilitate  such  requirements,  TBAC  supports  notions  such  as 
start-conditions  and  scope  specifications  that  model  these  kinds  of  constraints  (discussed  later). 
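The first phase of the voucher example can be sketched in Python as follows. This is only an illustration under stated assumptions: the TYPE_POLICY table, the AuthorizationStep class, and its method names are hypothetical and do not come from the paper; the sketch shows the type-based check at invocation, the check-in of executor-permissions, and their replacement by enabled-permissions once the step becomes valid.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical type-level policy: allowed (role, operation, object type) triples.
TYPE_POLICY: Set[Tuple[str, str, str]] = {
    ("clerk", "authorize-prepare", "voucher"),
    ("supervisor", "authorize-approve", "voucher"),
    ("clerk", "authorize-issue", "check"),
}


def type_check(role: str, operation: str, object_type: str) -> bool:
    """Access decision function over types rather than individual users."""
    return (role, operation, object_type) in TYPE_POLICY


class AuthorizationStep:
    def __init__(self, name: str) -> None:
        self.name = name
        self.executor: Optional[str] = None
        # Protection state as a set of (trustee or role, action) pairs.
        self.protection_state: Set[Tuple[str, str]] = set()
        self._executor_perms: Set[Tuple[str, str]] = set()

    def invoke(self, user: str, role: str, operation: str,
               object_type: str, executor_actions: Iterable[str]) -> bool:
        """Type-check the request, then check in the executor-permissions."""
        if not type_check(role, operation, object_type):
            return False
        self.executor = user
        self._executor_perms = {(user, action) for action in executor_actions}
        self.protection_state |= self._executor_perms
        return True

    def become_valid(self, enabled_perms: Iterable[Tuple[str, str]]) -> None:
        """Check out executor-permissions and check in enabled-permissions."""
        self.protection_state -= self._executor_perms
        self.protection_state |= set(enabled_perms)

    def become_invalid(self) -> None:
        """Check out whatever is still active."""
        self.protection_state.clear()


step1 = AuthorizationStep("authorize_prepare_voucher")
started = step1.invoke("C", "clerk", "authorize-prepare", "voucher", ["prepare"])
during = set(step1.protection_state)
step1.become_valid([("supervisor", "approve")])
after = set(step1.protection_state)
step1.become_invalid()
```

While clerk C is preparing the voucher, the protection state holds only C's executor-permission; once the step is valid it holds only the permission that enables the supervisor's approval, and it is empty after the step becomes invalid.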

In  summary,  TBAC  differs  from  traditional  passive  subject-object  models  in  many  respects  by  associating 
the  dimension  of  tasks  with  access  control.  First,  there  is  a  notion  of  protection  states,  which  represent 
active  permissions  that  are  maintained  for  each  authorization  step.  The  protection  state  of  each 
authorization  step  is  unique  and  disjoint  from  the  protection  states  of  other  steps.  Each  authorization-step 
corresponds  to  some  activity  or  task  within  the  broader  context  of  a  workflow.  Traditional  subject-object 
models  have  no  notion  of  access  control  for  processes  or  tasks.  Second,  TBAC  recognizes  the  notion  of  a 
life-cycle  and  associated  processing  steps  for  authorizations.  Third,  TBAC  dynamically  manages 
permissions  as  authorizations  progress  to  completion.  This  again  differs  from  subject-object  models  where 
the  primitive  units  of  access  control  information  contain  no  context  or  application  logic.  Also,  TBAC 
understands  the  notion  of  “usage”  associated  with  permissions.  Thus  an  active  permission  resulting  from  an 
authorization  does  not  imply  a  license  for  an  unlimited  number  of  accesses  with  that  permission.  Rather, 
authorizations  have  strict  usage,  validity,  and  expiration  characteristics  that  may  be  tracked  at  runtime.  In  a 
typical  subject-object  access  control  model,  a  permission  associated  with  a  subject-object  pair  implies 
nothing  more  than  the  fact  that  the  subject  has  the  permission  for  the  object.  There  is  no  recognition  or 
monitoring  of  the  usage  of  that  permission.  Finally,  TBAC  can  form  the  basis  of  self-administering  security 
models  as  security  administration  can  be  coupled  and  automated  with  task  activation  and  termination 
events. 


3.  A  Family  of  TBAC  Models 

Rather  than  formulating  one  simple  monolithic  model  of  TBAC  we  have  chosen  to  formulate  a  family  of 
models.  Before  discussing  the  models,  we  first  lay  out  a  framework  to  guide  us  in  designing  the  family  of 
models. 

3.1  Framework

Our framework consists of formulating a simple model of TBAC called TBAC0 and using this as a basis to
build  other  models. 


[Figure 6 labels: authorization-steps, dependencies]

Figure 6. A framework for a hierarchy of TBAC models


Figure 6 shows our framework. TBAC0 is a base model and is thus at the bottom of the lattice. It provides
basic facilities for modeling authorization-steps and dependencies relating various authorization-steps.
TBAC0 is a very general and flexible model and is thus the minimum requirement for any system that
supports TBAC. The advanced models TBAC1 and TBAC2 include (inherit) TBAC0. TBAC1 adds composite
authorizations, whereas TBAC2 adds constraints (both discussed shortly).

Formulating such a family of models has many benefits. Researchers and developers can compare their
system implementation of TBAC concepts with this family of models. Also, a family of models gives
developers various choices in choosing conformance points for their implementations and can thus serve as
a guide and evolution path for additional features.


3.2  The model TBAC0

We will now describe the model TBAC0 in more detail. We describe the various attributes or components
of the model in turn.


3.2.1  Components of an authorization-step

Every authorization-step has to specify a variety of attributes. We now describe briefly each of these
attributes (components) in turn.

•  Step-name: This is the name of the authorization-step.

•  Processing-state: The current processing state indicates how far the authorization-step has
progressed in its life-cycle (discussed shortly).

•  Protection-state: The protection-state defines all potential active permissions that can be
checked-in by the authorization-step. The current value of the protection-state, at any given
time, gives a snapshot of the active permissions at the time. Associated with every permission
is a validity-and-usage specification. The validity-and-usage specification specifies the
validity and usage aspects of the permissions associated with an authorization-step. It will thus
specify how the usage of the permissions will relate to the authorization remaining valid (or
becoming invalid).

•  Trustee-set: This contains relevant information about the set of trustees that can potentially
grant/invoke the authorization-step such as their user-identities and roles.

•  Executor-trustee: This records the member of the trustee-set that eventually grants the
authorization-step.

•  Task-handle: This stores relevant information such as the task and the event identifiers of the
task from which the authorization-step is invoked.

Let  us  now  formalize  these  concepts. 

Definition 1. We define a permission, p, as a tuple (s, o, a, u, as) where s stands for the subject or trustee and
o represents an object for which the subject is given the right to perform action a u times within an
authorization-step instance as. A permission is always associated with an authorization-step instance and its
associated protection state (to be explained shortly). If P is the set of permissions, then

$P \subseteq S \times O \times A \times U \times AS$

where S is a set of subjects/trustees
O is a set of objects
A is a set of action names
U is the usage and validity specification: a non-zero integer indicating the number of uses left (the special
symbol $\infty$ is used to indicate unlimited uses) and a flag v indicating if the last usage will make the
authorization-step invalid
AS is the set of authorization-step names.

Definition 2. For each authorization-step instance as, there is an associated protection-state $SS_{as}$ defined
by

$SS : AS \to 2^{P}$

$SS_{as} = \{(s, o, a, u, as') \in P \mid as' = as\}$


Definition 3. Each authorization-step instance, as, has a name and the following components:

Processing-state, $PS : AS \to PS$
Protection-state, $SS : AS \to 2^{P}$
Trustee-set, $TS : AS \to 2^{S}$
Executor-trustee, $ET : AS \to S$, $ET_{as} \in TS_{as}$
Task-handle, $TH : AS \to T$, where T is a set of tasks


We  also  state  informally  two  properties. 

"7:Z  "•  ““»■«  SD  ^^ponen.  is 

7:sz  the'isSof^C  “ 

The protection states of the various authorization-steps are disjoint. Thus every authorization-step
instance has a unique protection state. Thus, given a set of authorization-steps and their respective
protection states $p_1, p_2, \ldots, p_n$, the intersection of two or more of these states will be empty.
Formally,

For any $p_i$ and $p_j$ with $i \neq j$, $p_i \cap p_j = \emptyset$
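As an illustration of the definitions and the disjointness property, the following Python sketch encodes a permission as a five-component tuple and derives protection states by filtering on the authorization-step component. The names Permission, Usage, protection_state, and pairwise_disjoint are assumptions for this example, as is the use of None for the unlimited-uses symbol.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, NamedTuple, Optional, Set


class Usage(NamedTuple):
    uses_left: Optional[int]  # None stands in for "unlimited"
    invalidates: bool         # flag v: the last use makes the step invalid


class Permission(NamedTuple):
    s: str      # subject / trustee
    o: str      # object
    a: str      # action name
    u: Usage    # usage and validity specification
    as_: str    # authorization-step instance ("as" is a Python keyword)


def protection_state(P: Set[Permission], step: str) -> FrozenSet[Permission]:
    """SS_as: all permissions in P whose step component equals the given step."""
    return frozenset(p for p in P if p.as_ == step)


def pairwise_disjoint(states: Iterable[FrozenSet[Permission]]) -> bool:
    """True if no two of the given protection states share a permission."""
    return all(not (x & y) for x, y in combinations(list(states), 2))


P = {
    Permission("C", "voucher_7", "prepare", Usage(1, False), "prepare#1"),
    Permission("S", "voucher_7", "approve", Usage(1, True), "approve#1"),
    Permission("S", "voucher_7", "read", Usage(None, False), "approve#1"),
}
states = [protection_state(P, step) for step in ("prepare#1", "approve#1")]
```

In this encoding disjointness holds by construction: each tuple carries exactly one authorization-step component, so it can fall into only one protection state.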


3.2.2  Processing  states  and  life-cycle  of  authorizations 


An authorization in TBAC is not static; rather it has a lifetime and a life-cycle associated with it. To
better understand the execution aspects of authorizations, it is useful to consider the various
processing states that every instance of an authorization-step goes through during its life-cycle.

Figure 8. Basic processing states for an authorization-step

A simple view of this life-cycle is to consider every authorization-step instance as going through five states,
as shown in Figure 8. An authorization is dormant when it has not been invoked (requested) by any task.
Once invoked, an authorization-step comes into existence and will be processed. If this processing is
successful, the authorization-step enters the valid state; otherwise it becomes invalid. In the valid state, all
associated permissions with the authorization are turned on and made available for consumption. From the
valid state, an authorization-step will undergo further processing and eventually reach the end of its lifetime
and enter the invalid state. Also, a valid authorization-step may be put on hold temporarily. When this
happens, all permissions associated with the authorization-step are inactive and cannot be used to gain any
access until this hold is released and the validity reinstated. Eventually, when an authorization becomes
invalid, it ceases to exist, and is deleted from the system.


However,  to  get  a  more  detailed  description  of  what  happens  to  an  authorization  during  its  lifetime,  one  can 
derive  a  more  elaborate  state  diagram  such  as  that  shown  in  Figure  9.  This  more  elaborate  state  diagram 
recognizes  the  dimension  of  usage  of  permissions.  A  permission  that  is  in  the  protection  state  of  an 
authorization-step is consumed if any action that is enabled by the authorization-step requires the
permission.  Every  operation  request  thus  decrements  the  usage  count  of  the  permission.  Once  the  usage 
limit  is  reached  an  action  will  no  longer  succeed  as  TBAC  ensures  that  the  required  permission  is  no  longer 
available. 

Figure  9  is  a  direct  refinement  of  Figure  8.  The  aborted  and  started  states  of  Figure  9  are  a  refinement  of  the 
invoked  state  of  Figure  8.  Similarly,  the  valid,  hold  and  invalid  states  of  Figure  8  are  each  refined  into  a  pair 
of  corresponding  used  and  unused  states  in  Figure  9. 


Figure  9.  Detailed  processing  states  of  an  authorization-step 


We  describe  each  of  the  processing  states  below. 


•  Dormant:  An  authorization-step  is  in  this  state  if  it  has  not  been  invoked  by  any  task. 
Equivalently,  the  dormant  state  can  be  viewed  as  one  where  the  authorization-step  does  not  as 
yet  exist.  In  particular,  the  protection  state  of  the  authorization-step  is  empty. 

•  Started:  Once  an  authorization-step  has  been  successfully  invoked,  it  enters  this  state  where 
processing  begins. 

•  Aborted:  The  aborted  state  is  in  many  ways  similar  to  dormant  except  that  a  failed  attempt  to 
start  the  authorization-step  was  made  in  this  case. 

•  Valid-unused:  Once  an  authorization-step  has  been  started  subsequent  successful  processing 
will  transition  it  into  the  valid-unused  state. 

•  Valid-used:  If  an  authorization  was  in  a  valid-unused  state,  and  it  is  subsequently  used  or 
consumed,  then  it  enters  the  valid-used  state.  Depending  on  policy,  an  authorization  may  be 
used  multiple  times  before  it  enters  the  invalid  state. 




•  Invalid-unused: This state is entered if certain conditions for an authorization to be valid are
not  met  upon  termination  or  if  the  authorization  had  entered  the  valid-unused  state  and  was 
subsequently  revoked. 

•  Invalid-used:  This  state  is  entered  either  as  a  result  of  a  last-use  transition  from  the  valid- 
unused  state  or  as  a  result  of  a  revoke  or  last-use  event  (transition)  from  the  valid-used  state. 

•  Hold-unused: In this state the unused authorization is temporarily suspended. All associated
permissions  will  thus  be  inactive. 

•  Hold-used:  The  authorization  is  temporarily  suspended.  All  associated  permissions  will  thus 
be  inactive. 


We  can  explain  some  of  the  semantics  associated  with  the  various  states  and  transitions  by  considering  the 
sample  authorization-step  below. 

•  authorize_prepare_voucher  •  clerk


In this example, an authorization to prepare a voucher is requested by a user in the role of a clerk. When
this step is invoked and an instance of this authorization-step is created, a type-check is made to ensure that
the “prepare” permission is allowed between the voucher and clerk type. If this check succeeds, the step
transitions into the started state and the executor-permissions are checked-in (activated) into the
protection-state. Between the started and valid-unused or invalid-unused states, there are no changes in the
protection state. Once the step reaches the valid-unused state, the executor-permissions are checked out and
the enabled-permissions are checked into the protection state.

These  enabled-permissions  will  allow  other  actions  to  continue  in  the  overall  workflow.  At  some  point,  the 
authorization-step  will  become  invalid  and  any  remaining  permissions  in  the  enabled-permissions  set  will  be 
checked  out  (deactivated). 
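The detailed life-cycle can be approximated as a small state machine. The sketch below is written in Python and is based only on the prose descriptions of the states above, not on Figure 9 itself; the event names (start, abort, succeed, fail, use, last_use, revoke, hold, release) are assumptions chosen for illustration.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class State(Enum):
    DORMANT = "dormant"
    STARTED = "started"
    ABORTED = "aborted"
    VALID_UNUSED = "valid-unused"
    VALID_USED = "valid-used"
    INVALID_UNUSED = "invalid-unused"
    INVALID_USED = "invalid-used"
    HOLD_UNUSED = "hold-unused"
    HOLD_USED = "hold-used"


# (current state, event) -> next state; event names are illustrative.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.DORMANT, "start"): State.STARTED,
    (State.DORMANT, "abort"): State.ABORTED,
    (State.STARTED, "succeed"): State.VALID_UNUSED,
    (State.STARTED, "fail"): State.INVALID_UNUSED,
    (State.VALID_UNUSED, "use"): State.VALID_USED,
    (State.VALID_UNUSED, "last_use"): State.INVALID_USED,
    (State.VALID_UNUSED, "revoke"): State.INVALID_UNUSED,
    (State.VALID_UNUSED, "hold"): State.HOLD_UNUSED,
    (State.HOLD_UNUSED, "release"): State.VALID_UNUSED,
    (State.VALID_USED, "use"): State.VALID_USED,
    (State.VALID_USED, "last_use"): State.INVALID_USED,
    (State.VALID_USED, "revoke"): State.INVALID_USED,
    (State.VALID_USED, "hold"): State.HOLD_USED,
    (State.HOLD_USED, "release"): State.VALID_USED,
}


def step(state: State, event: str) -> State:
    """Apply one event, rejecting transitions the table does not list."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.value}") from None


def run(events: Iterable[str], state: State = State.DORMANT) -> State:
    """Replay a sequence of events from the given starting state."""
    for event in events:
        state = step(state, event)
    return state


final = run(["start", "succeed", "use", "hold", "release", "last_use"])
```

Because the hold states have no outgoing "use" transition, a permission request against a suspended authorization is rejected, which matches the statement that all associated permissions are inactive while on hold.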


3.2.3  Basic  dependencies  to  construct  authorization  policies 


In  the  previous  sections,  we  discussed  authorization-steps.  However,  in  any  application  or  workflow  logic, 
authorization  steps  do  not  stand  in  isolation.  Rather,  they  are  often  related  and  dependent  on  each  other  due 
to policy implications. We now discuss various dependencies and constructs that relate authorization-steps
to  each  other  and  constrain  their  execution  and  behavior.  These  dependencies  can  thus  be  used  to  formulate 
enterprise-oriented  authorization  policies. 

We  specify  dependencies  in  terms  of  existential,  temporal,  and  concurrency  relationships  that  hold  between 
events (or states resulting from the occurrence of events). Given an authorization-step A, we use the
following notation for the various states of A:

•  $A^{dr}$: the dormant state

•  $A^{st}$: the started state

•  $A^{ab}$: the aborted state

•  $A^{vu}$: the valid-unused state

•  $A^{vd}$: the valid-used state

•  $A^{iu}$: the invalid-unused state

•  $A^{id}$: the invalid-used state

•  $A^{hu}$: the hold-unused state

•  $A^{hd}$: the hold-used state


We list the dependency types and their meanings (interpretations) below.

1.  $A1^{state1} \rightarrow A2^{state2}$: if A1 transitions into state1, then A2 must also transition into state2.

2.  $A1^{state1} < A2^{state2}$: if both A1 and A2 transition into states state1 and state2 respectively, then
A1's transition must occur before A2's.

3.  $A1^{state1} \# A2^{state2}$: A1 cannot be in state1 concurrently when A2 is in state2.

4.  $A1^{state1} \,|||\, A2^{state2}$: A1 must concurrently be in state1 when A2 is in state2.

The first two dependency types, $\rightarrow$ and $<$, express existential and temporal predicates and as such
are best interpreted as predicates between transition events that lead to changes in the processing states of
authorization-steps. They were originally proposed by Klein in [6] to capture the semantics of database
transaction protocols. The other dependencies express concurrency properties.

In a later subsection, we will illustrate the use of these dependencies with an order-processing example. Let
us now formalize the concept of dependencies.

Definition 4. We define a dependency type as one of the following: $\rightarrow$, $<$, $\#$, or $|||$, and DT,
the set of dependency types, as $\{\rightarrow, <, \#, |||\}$.

Definition 5. We define a dependency instance as a tuple (a1, dt, a2) for which an assignment relation
holds from a1 to a2. If D is the set of dependencies, then

$D \subseteq AS \times DT \times AS$
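One possible reading of the four dependency types is sketched below in Python as checks over a recorded trace of state transitions. The Dependency record (which carries the two states explicitly), the holds function, the string spellings of the dependency types, and the sample step and state names are all assumptions for illustration; the paper does not prescribe an evaluation procedure.

```python
from typing import Dict, List, NamedTuple, Tuple

Event = Tuple[str, str]  # (authorization-step, state entered)


class Dependency(NamedTuple):
    a1: str
    state1: str
    dt: str      # one of "->", "<", "#", "|||"
    a2: str
    state2: str


def holds(dep: Dependency, trace: List[Event]) -> bool:
    """Check one dependency against a complete, ordered transition trace."""
    e1, e2 = (dep.a1, dep.state1), (dep.a2, dep.state2)
    if dep.dt == "->":   # existential: e1 occurring requires e2 to occur
        return e1 not in trace or e2 in trace
    if dep.dt == "<":    # temporal: if both occur, e1 comes first
        return not (e1 in trace and e2 in trace) or trace.index(e1) < trace.index(e2)
    if dep.dt not in ("#", "|||"):
        raise ValueError(f"unknown dependency type {dep.dt!r}")
    current: Dict[str, str] = {}
    for step, state in trace:  # replay to inspect concurrent states
        current[step] = state
        in1 = current.get(dep.a1) == dep.state1
        in2 = current.get(dep.a2) == dep.state2
        if dep.dt == "#" and in1 and in2:
            return False       # the two states must never coincide
        if dep.dt == "|||" and in2 and not in1:
            return False       # a2 in state2 requires a1 in state1
    return True


trace = [
    ("prepare", "started"), ("prepare", "valid"),
    ("approve", "started"), ("approve", "valid"),
    ("prepare", "invalid"),
]
ordering = Dependency("prepare", "valid", "<", "approve", "started")
exclusion = Dependency("prepare", "started", "#", "approve", "started")
coexist = Dependency("prepare", "valid", "|||", "approve", "valid")
checks = [holds(ordering, trace), holds(exclusion, trace), holds(coexist, trace)]
```

In this trace the ordering and exclusion dependencies hold, while the co-existence dependency is violated at the last event, because the approve step is still valid after the prepare step has become invalid.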


3.2.4  Formal characterization of TBAC0

We now formally define model TBAC0 as follows.

Definition 6. The TBAC0 model consists of the following:

•  AS, a set of authorization steps;

•  SS, a set of protection states;

•  P, a set of permissions;

•  D, a set of dependency instances;

•  astep: $SS \to AS$, a function mapping each protection-state to a single authorization-step;

•  pstate: $AS \to SS$, a function mapping each authorization-step to a single protection state.


3.3  The model TBAC1 to support composite authorizations

The model TBAC1 supports the notion of composite authorizations. A composite authorization is an
abstraction  that  encapsulates  two  or  more  authorization-steps.  This  is  convenient  when  an  authorization-step 
is  too  fine-grained  a  unit  to  express  authorization  requirements  at  a  high  (abstract)  level. 

For  example,  consider  the  authorization  to  transfer  funds  from  one  bank  account  to  another.  Such  an  action 
typically  requires  two  authorizations.  The  first  authorization  is  for  withdrawal  of  funds  from  the  source 
account and the second to deposit funds into the target account. However, it is useful for modeling purposes
to think of a more composite abstraction called “authorize-transfer” that consists of the individual
authorization-steps.


Thus a composite-authorization consists of a set of component authorization-steps. These component
authorization-steps can be related only to other steps within the same composite-authorization through various
dependencies.  In  other  words,  the  authorization-steps  of  a  composite-authorization  are  not  visible  externally 
to  other  authorization-steps  outside  the  composite-authorization.  The  motivation  for  this  restriction  comes 
from  a  desire  to  follow  sound  software-engineering  principles,  especially  those  related  to  encapsulation  and 
information  hiding.  Thus  to  the  external  world,  a  composite-authorization  is  a  single  abstraction. 

Collectively, the above properties and restrictions impose different semantics during the lifetime of a
composite-authorization. In particular, we have to reexamine the notions of when we consider a
composite-authorization to be started, valid, and invalid. We approach these issues by associating a
critical-set of component authorization-steps with every composite-authorization. The critical-set is a subset
of the component authorization-steps. We consider a composite-authorization to have started when any
member of the critical-set has reached the started state. To be considered valid, all steps in the critical-set
have to reach their respective valid states. On the other hand, a composite-authorization is considered
invalid as soon as any step in the critical-set becomes invalid.
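The critical-set rules can be expressed as a short function. The following Python sketch is an illustration only: the function name composite_status, the simplified state names, and the withdraw/deposit step names are assumptions, and a step that is already valid is treated as having reached the started state.

```python
from typing import Mapping


def composite_status(critical_states: Mapping[str, str]) -> str:
    """Derive a composite-authorization's state from its critical-set.

    critical_states maps each critical-set member to one of the simplified
    states "dormant", "started", "valid", or "invalid" (names assumed here).
    """
    states = list(critical_states.values())
    if not states:
        raise ValueError("the critical-set must not be empty")
    if any(s == "invalid" for s in states):
        return "invalid"   # any invalid member invalidates the composite
    if all(s == "valid" for s in states):
        return "valid"     # every member has reached its valid state
    if any(s in ("started", "valid") for s in states):
        return "started"   # at least one member has started
    return "dormant"


# Hypothetical critical-set for the "authorize-transfer" composite.
progress = [
    composite_status({"withdraw": "dormant", "deposit": "dormant"}),
    composite_status({"withdraw": "started", "deposit": "dormant"}),
    composite_status({"withdraw": "valid", "deposit": "started"}),
    composite_status({"withdraw": "valid", "deposit": "valid"}),
    composite_status({"withdraw": "valid", "deposit": "invalid"}),
]
```

The invalid check comes first so that a composite-authorization is reported invalid as soon as one critical step fails, regardless of how far the other steps have progressed.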


In addition to the validity associated with the critical-set, a composite-authorization may declare other
non-critical-sets of authorization-steps to capture additional states of validity. However, these other sets can
become valid only when the critical-set itself is valid and can remain valid only as long as the critical-set
remains valid. Collectively, the critical-set along with the various non-critical sets, define progressive states
(checkpoints) of validity. The specification of a critical-set within a composite-authorization should thus be
done with careful thought given to some minimal notion of validity that ensures consistency with
authorization policies for the enterprise.

3.4  The model TBAC2 and constraints

As mentioned earlier, TBAC2 supports more advanced notions of constraints. Thus TBAC2 would be more
suitable for an organization that finds TBAC0 to be too open-ended or not having tight enough controls.

We  classify  constraints  as  static  or  dynamic  constraints.  Static  constraints  are  those  that  can  be  defined  and 
enforced  when  authorization-steps  are  specified.  Dynamic  constraints  on  the  other  hand,  are  those  that  can 
be  evaluated  only  at  runtime  as  authorization-steps  are  being  processed. 

In TBAC2, every authorization-step has two components in addition to those present in TBAC0. We
describe these below.


•  Start-condition  (SC).  This  component  can  be  used  to  specify  a  rich  set  of  constraints  that  govern 
whether  an  authorization-step  can  transition  into  the  started  state. 

•  Scope (SP). This component controls the visibility of an authorization-step with respect to other
authorization-steps when formulating and enforcing authorization policies. Thus scope can be used to
control if an authorization-step is visible to an entire workflow, a task, or other finer units such as
sub-tasks.

We are currently investigating other static constraints for authorization-steps such as:

•  Constraints on processing state: This constraint can be used to remove certain processing states (such
as hold) from the life cycle of a step.

•  Constraints on protection state: This can be used to constrain the permissions that are allowed in the
protection state (i.e. activated by the step).

•  Constraints on trustee-set: This can be used to constrain the type as well as the instances of the
trustees that can belong to this set. For example, we may want to constrain that the trustee be of type
role and limited to instances of project managers and supervisors.

•  Constraints on executor permissions: This can be used to specify what permissions are not allowed to
be among the operating permissions.

The  most  obvious  examples  of  dynamic  constraints  are  those  involving  dynamic  separation  of  duties/roles 
and  coincidence  of  roles.  Consider  the  following  four  authorizations  (for  brevity  we  show  only  the  step 
name  and  the  trustee-name  specified  in  terms  of  roles). 

A1:  auth_prepare_check  •  clerk
A2:  auth_approve_check  •  supervisor
A3:  auth_issue_check  •  clerk
A4:  auth_reapprove_check  •  supervisor

To prevent fraud and implement various checks and balances, the enterprise policy may dictate that the
clerks performing steps A1 and A3 be distinct (separation of duties) while the supervisors for steps A2 and
A4 be the same. However, since any clerk or supervisor in the enterprise may be allowed to first perform
A1 and A2 respectively, these constraints can be evaluated only at runtime. TBAC2 allows for the
specification of such dynamic constraints. These dynamic constraints are evaluated by looking at the history
of the executor trustees in the authorizations that have been invoked. TBAC2 also allows considerable
modeling flexibility by allowing the reach of such dynamic constraints to be influenced by other static
constraints such as scope. Thus we may specify that a dynamic separation of duties requirement hold
across the scope of a sub-task, task, or other coarser units. By keeping track of the executor trustees of
invoked authorizations and combining the notions of dependencies and scope, the TBAC2 model can be
used to provide a much more powerful and general approach to specifying separation of duties requirements
than transaction control expressions (proposed in [11]).
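The evaluation of such dynamic constraints against the history of executor trustees might look like the following Python sketch. The constraint encoding ("distinct" and "same"), the function may_execute, and the trustee names are hypothetical; the history passed in is assumed to be already restricted to the scope over which the constraints apply.

```python
from typing import List, Sequence, Tuple

# History of (step name, executor trustee) pairs within one scope.
History = List[Tuple[str, str]]

# Hypothetical encoding of the policy for steps A1..A4.
CONSTRAINTS: Sequence[Tuple[str, str, str]] = (
    ("distinct", "A1", "A3"),  # separation of duties between the two clerks
    ("same", "A2", "A4"),      # coincidence: the same supervisor re-approves
)


def may_execute(step: str, trustee: str, history: History,
                constraints: Sequence[Tuple[str, str, str]] = CONSTRAINTS) -> bool:
    """Decide at runtime whether the trustee may act as executor of the step."""
    executors = dict(history)  # most recent executor recorded for each step
    for kind, x, y in constraints:
        if step not in (x, y):
            continue
        other = y if step == x else x
        if other not in executors:
            continue           # the related step has not been executed yet
        same_trustee = executors[other] == trustee
        if kind == "distinct" and same_trustee:
            return False
        if kind == "same" and not same_trustee:
            return False
    return True


history: History = [("A1", "alice"), ("A2", "sam")]
decisions = {
    ("A3", "alice"): may_execute("A3", "alice", history),
    ("A3", "bob"): may_execute("A3", "bob", history),
    ("A4", "sam"): may_execute("A4", "sam", history),
    ("A4", "sue"): may_execute("A4", "sue", history),
}
```

Any clerk may perform A1 while the history is empty; the constraint only takes effect once a related step has an executor on record, which is why it cannot be checked when the authorization-steps are specified.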


4.  Conclusions  and  Summary 


We  have  described  an  active  approach  and  a  family  of  models  for  authorization  management,  collectively 
called  task-based  authorization  controls  (TBAC).  Our  approach  differs  from  passive  subject-object  models 
in many respects. Permissions are controlled and managed in such a way that they are turned on only in a
just-in-time  fashion  and  synchronized  with  the  processing  of  authorizations  in  progressing  tasks.  An 
authorization-step  is  a  fundamental  abstraction  in  TBAC  and  is  used  to  group  and  manage  a  set  of  related 
permissions.  To  enable  this,  TBAC  supports  the  notion  of  a  lifecycle  for  an  authorization-step.  Further, 
TBAC  keeps  track  of  the  usage  and  consumption  of  permissions,  thereby  preventing  the  abuse  of 
permissions through unnecessary and malicious operations. TBAC provides for the modeling of
enterprise-oriented  authorization  policies  using  dependencies  that  relate  authorizations  according  to  some 
enterprise  policy.  Our  long-term  goal  is  to  develop  a  variety  of  tools  for  the  modeling  and  enforcement  of 
authorization  policies.  The  modeling  tools  will  enable  a  security  officer  to  formulate  as  well  as  modify 
authorization  policies  using  the  paradigm  of  visual  languages  for  interaction.  Enforcement  will  be  achieved 
through  authorization  servers  that  load  stored  policies  and  enforce  the  policies  at  runtime  when  tasks  and 
workflows  are  invoked. 

We  are  currently  investigating  several  issues.  The  consolidated  model  TBAC3  needs  further  examination.  In 
particular, the interaction of composite-authorizations from TBAC1 and constraints from TBAC2 requires
further  study.  We  are  also  looking  at  formulating  higher  level  modeling  constructs  for  authorizations  that 
can  be  composed  from  the  five  types  of  dependencies  mentioned  in  the  paper.  For  example,  it  might  be 
useful  to  have  a  construct  to  express  atomicity  semantics  on  the  validity  of  a  set  of  authorizations.  Also  of 
interest  is  a  framework  to  cohesively  model  and  understand  various  constraints.  From  the  standpoint  of 
building  end  user  tools,  we  are  exploring  various  aspects  of  visual  languages  and  in  particular  visual 
metaphors  and  related  policy  grammars  to  be  used  by  end  user  tools  to  express  authorization  policies.  The 
mapping  of  policy  sentences  to  dependencies  and  various  security  rules  that  will  be  automatically 
incorporated  into  workflow  task  definitions  is  also  under  investigation.  Also  under  investigation  are  issues 
related  to  the  delegation  and  revocation  of  authorizations  and  their  related  permissions. 
Abstract

Workflow Management (WFM) Systems automate traditional processes where information
flows between individuals. WFM systems have two major implications for
security. Firstly, since the description of a workflow process explicitly states when
which function is to be performed by whom, security specifications may be automatically
derived from such descriptions. Secondly, the derived security specifications have
to be enforced. This paper considers these issues for a Cyberspace workflow system
by describing a small, but comprehensive example.

The  notion  of  an  Alter-ego  is  central  in  this  description:  Alter-egos  are  objects  that 
represent  individuals  in  Cyberspace  (and  not  merely  identify  them).  In  Cyberspace, 
documents in a workflow system therefore flow between Alter-egos, rather than between
individuals.


Keywords 

Security and Database systems, Workflow, Cyberspace, Object-Oriented Databases,
Role-based security

1  Introduction 

Workflow  Management  (WFM)  Systems  automate  traditional  processes  where  information 
flows  between  individuals.  Although  WFM  systems  have  been  in  existence  for  a  number 
of  years,  the  trend  towards  greater  interconnection  will  greatly  impact  such  systems.  On 
the  one  hand,  interaction  will  involve  more  and  more  nonhuman  participants.  On  the 
other  hand  the  participants  in  workflow  processes  will  become  more  and  more  unrelated. 
To  illustrate  the  latter  trend,  consider  the  following  ‘generations’  of  workflow  systems: 
1. All individuals who take part in a workflow process are typically part of the same
organization. 

2.  Individuals  who  take  part  in  a  workflow  system  are  not  necessarily  members  of  the 
organization  who  ‘owns’  the  WFM  system,  but  are  registered  with  that  organization. 

3. Participants in a workflow process may never have had any contact prior to participating
in the same workflow process.

The key to secure implementation of later generation WFM systems is proper authentication
and authorization of participants in a workflow process. It is our contention that
Alter-egos (see the next section) are particularly suitable for authentication, while roles
are particularly suitable for authorization. Stated differently: we will assume that a
potential participant will present an Alter-ego that will serve as proof of the participant’s
identity.

The  intention  of  this  paper  is  to  study  the  derivation  of  security  rules  from  a  WFM 
design  tool  and  to  consider  how  such  rules  may  be  implemented.  Of  particular  concern  is 
the distinction between the individual, represented by an Alter-ego, and the role in which
the  individual  acts. 

The  paper  is  structured  as  follows:  The  following  section  gives  some  background  on 
Alter-egos,  workflow  and  security.  Section  3  introduces  the  insurance  claim  example  that 
is  used  in  this  paper  and  shows  how  security  specifications  may  be  derived  from  the 
workflow  specification.  Section  4  discusses  an  implementation  strategy  for  a  workflow 
process. Section 5 reviews supporting technologies. Section 6 contains the conclusions of the paper.

2  Background 

2.1 Alter-egos

Individuals,  either  in  an  office  environment,  or  in  their  homes,  will  be  represented  in 
Cyberspace  by  objects,  called  Alter-egos,  in  the  sense  of  Object-Oriented  Technology, 
and may be considered a combination of Social Security Number and e-mail address.
They were introduced in [van de Riet & Gudes96], where it was shown how these Alter-egos
can be structured and how Security and Privacy (S&P) aspects can be dealt with.
Questions around Responsibility and Obligations of Alter-egos have been discussed in
[van de Riet & Burg96a, van de Riet & Burg96b].

The  use  of  Alter-egos  to  provide  high-level  security  has  been  discussed  in  an  earlier 
paper [van de Riet & Gudes96]. The main idea was that if the underlying communication
system  of  Cyberspace  ensures  that  every  message  contains  an  unforgeable  Alter-ego  of  the 
sender  (or  initiator),  one  can  design  more  powerful  and  higher  level  protection  mechanisms 
than  those  existing  today  and  which  rely  mainly  on  encrypting  messages. 
2.2 Workflow


In  Workflow  management  (WFM)  applications  there  are  tasks  to  be  completed  by  some 
organization,  but  the  organization  procedures  require  that  this  task  will  be  carried  out  in 
steps  where  each  step  is  executed  by  a  different  individual  and  no  step  can  be  performed 
before  the  steps  it  depends  on  are  completed  [Georgakopoulos95].  We  shall  demonstrate  a 
certain WFM-tool, COLOR-X, developed by the group in Amsterdam to model Information
and Communication Systems, using linguistic knowledge, and we will see how S&P
rules can be derived from COLOR-X diagrams.

WFM  tools  are  currently  being  used  to  specify  how  people  and  information  systems 
are  cooperating  within  one  organization.  There  are  at  least  three  reasons  why  WFM 
techniques are also useful in Cyberspace. First, organizations tend to become multinational
and communication takes place in a global manner. Secondly, more and more
commerce  is  being  done  electronically.  This  implies  that  procedures  have  to  be  designed 
to  specify  the  behaviour  of  the  participants.  These  procedures  may  be  somewhat  different 
from  ordinary  WFM  designs,  where  the  emphasis  is  on  carrying  out  certain  tasks  by  the 
users,  while  in  commerce  procedures  are  based  on  negotiating,  promises,  commitments 
and  deliveries  of  goods  and  money.  However,  as  we  will  see,  these  notions  are  also  present 
in  the  WFM  tool  we  will  use.  Thirdly,  people  will  be  participants  in  all  kinds  of  formalized 
procedures,  such  as  tax  paying  or  home  banking. 

2.3  Workflow  and  Security 

This being said, how can we derive security and privacy rules from the Workflow diagrams
(WFDs)? Specifying tasks and actions of people working in an organization
naturally also involves the specification of their responsibilities [van de Riet & Burg96a,
van de Riet & Burg96b, Olivier96]. This is what WFDs usually do. Responsibility implies
access  to  databases  to  perform  certain  actions  on  data  of  individuals. 

A  Workflow  Authorization  Model  is  proposed  in  [Atluri  &  Huang96b].  Authorization 
Templates  are  associated  with  each  workflow  task  and  used  to  grant  rights  to  subjects 
only  when  they  require  the  rights  to  perform  tasks.  A  Petri  net  implementation  model 
is also given. Where [Atluri & Huang96a] focusses on synchronising workflow and authorization
flow, the current paper focusses on the relationship between individuals (represented
by Alter-egos) and the roles they occupy in such workflow processes, while in
[Atluri & Huang96a] the authors emphasize information-flow issues using a multi-level
security model.

[Figure 1: Fragment of a workflow process. Recoverable box labels: "Claim"; "Deny by Clerk or Expert as Approver"; "Approve by Clerk or Expert as Approver".]

2.4 Alter-egos, roles and workflow security

As noted earlier, it is our contention that Alter-egos are particularly suitable for authentication,
while roles are particularly suitable for authorization. Not only are Alter-egos
suitable for authentication; the nature of the mechanism used to represent participants has
to progress towards full-function Alter-egos. Similarly, the role concept needs to evolve
from that required by the earlier generations to that required by the later generations.

Alter-egos  required  by  first  generation  WFM  systems  primarily  serve  to  authenticate 
the  user  to  the  system.  A  user  identifier  with  a  password  or  PIN  (personal  identification 
number)  may  be  adequate.  In  second  generation  systems  the  requirements,  in  addition, 
include  privacy,  integrity  and  non-repudiation.  These  issues  become  important  since  many 
of  the  participants  are  not  employed  by  the  organization  and  are  accessing  the  system 
from  remote  systems.  In  third  generation  systems  these  issues  are  still  important,  but  it 
also  becomes  important  to  construct  the  Alter-egos  such  that  they  can  be  trusted  by  the 
represented  individual  to  act  as  agent  for  the  individual. 

3  The  Insurance-claim  Application 

In this paper the following simple fragment of a workflow application will be used to illustrate
the concepts involved: A claimant submits an insurance claim, which is either
approved or rejected by some approver. If the claim is not approved (or rejected) within
some specified period, the approver is reminded to give attention to this claim. This fragment
is depicted graphically in figure 1. For another example, see [van de Riet & Burg97].
The following requirements are obvious, even from this simple example:

1.  The  approver  has  to  be  duly  authorized  to  approve  the  claim.  The  requirements  for 
such  authorization  depend  on  the  specific  application;  the  mechanisms  to  enforce 
such  requirements  are  the  concern  of  this  paper. 

2. Access to the documents involved in the workflow depends on the roles individuals
play  in  the  workflow  process. 

A  specification  for  the  same  process,  but  using  a  COLOR-X  diagram,  is  given  in  figure 
2.  Note  that  in  figure  2  we  used  the  following  conventions:  A  box  in  figure  2  (a)  denotes  an 
entity or type. A line is a relationship and a line with an open arrow is an is_a relationship.
The  circle  with  an  X  means  exclusion.  In  figure  2  (b)  we  have: 

•  each  box  of  actions  has  a  mode:  PERMIT,  NEC  or  MUST.  The  latter  one  means  an 
obligation  based  on  some  negotiating  in  the  past:  as  we  are  not  sure  that  the  action 
is  actually  carried  out  within  the  prescribed  time  it  is  necessary  to  define  a  counter 
measure.  The  mode  NEC  means  we  can  be  sure  the  action  is  necessarily  carried  out 
by  the  system.  PERMIT  is  self  evident. 

•  the  actions  are  described  in  a  formal  language  involving  the  participants  and  their 
roles; 

• the lightning arrow denotes a situation in which the conditions in the identification
part (denoted by id) are not satisfied. In the diagram we let the reminders go on
indefinitely, which was done for brevity's sake;

• the “approve” and “deny” boxes from figure 1 correspond to the arrows R2=“approve”
and R2=“deny” leaving the second-lowest box in figure 2 (b).

We  now  derive  the  authorization  tuples  from  the  diagrams  above.  We  use  the  following 
heuristic  rules: 

1.  If  an  action  involves  data  in  a  database,  the  agent  of  this  action  should  be  authorized 
to  perform  the  corresponding  actions  on  the  database. 

2.  An  action,  with  modality  MUST,  involves  an  obligation  to  perform  a  specific  action 
within  a  prescribed  amount  of  time.  This  implies  that  in  some  database,  oblDB,  the 
administration  about  this  obligation  is  kept.  The  object  which  creates  the  MUST 
action,  i.e.  the  agent  of  the  action  leading  to  this  MUST  action,  can  move  the 
deadline  (only  shift  it  to  the  future  of  course);  that  is  explicitly  not  allowed  to  the 
object  who  has  to  carry  out  the  action.  Of  course  this  object  can  refuse  to  carry  it 
out,  but  then  penalties  may  be  the  result. 
3. In a send or reminder action the sender is assumed to write in the message database,
while  the  receiver  is  assumed  to  be  able  to  read  from  it. 
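
Heuristic rule 2 can be illustrated with a minimal Python sketch. It assumes an obligation record holds the creator, the obliged party and a deadline; the class name `Obligation` and its field names are hypothetical and are not part of COLOR-X or Mokum.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Obligation:
    """One record in oblDB: who imposed the MUST action, who owes it, and by when."""
    creator: str       # agent of the action that led to the MUST action
    obligor: str       # object that has to carry out the action
    deadline: datetime

    def shift(self, requester: str, new_deadline: datetime) -> None:
        # Only the creator may move the deadline; the obligor is explicitly excluded.
        if requester != self.creator:
            raise PermissionError(f"{requester} may not shift this deadline")
        # The deadline may only be moved into the future.
        if new_deadline <= self.deadline:
            raise ValueError("a deadline can only be shifted to a later time")
        self.deadline = new_deadline
```

With a claimant as creator and an approver as obligor, a call to `shift` by the claimant succeeds, while the same call by the approver raises `PermissionError`.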

The databases involved are:

•  For  the  claims:  claimDB; 

•  For  obligations:  oblDB;  and 

•  For  the  messages:  messDB. 

The  syntax  we  will  use  to  indicate  that  an  actor  in  some  role  is  authorized  to  perform  a 
specific  operation  on  a  specific  database,  is  as  follows: 

AUTH<name database, role actor, operation>

From  the  diagram  we  thus  have  the  following  authorization  tuples: 

AUTH<claimDB, claimant, add>

AUTH<claimDB, insurance_company, read>

AUTH<claimDB,  approver,  read> 

AUTH<claimDB,  cashier,  create> 

AUTH<oblDB,  claimant,  shift> 

AUTH<oblDB,  approver,  not  shift> 

AUTH<messDB,  insurance_company,  write> 

AUTH<messDB,  approver,  read> 

AUTH<messDB,  cashier,  read> 
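
As an illustration, the tuples listed here can be held in a set and queried. The sketch below is in Python and rests on two assumptions that the text does not state: access is denied unless a positive tuple exists (closed world), and a negative tuple such as "not shift" overrides a positive one. The names `Auth` and `is_authorized` are hypothetical.

```python
from typing import NamedTuple


class Auth(NamedTuple):
    database: str
    role: str
    operation: str
    positive: bool = True   # False encodes a "not <operation>" tuple


AUTHS = {
    Auth("claimDB", "claimant", "add"),
    Auth("claimDB", "insurance_company", "read"),
    Auth("claimDB", "approver", "read"),
    Auth("claimDB", "cashier", "create"),
    Auth("oblDB", "claimant", "shift"),
    Auth("oblDB", "approver", "shift", positive=False),
    Auth("messDB", "insurance_company", "write"),
    Auth("messDB", "approver", "read"),
    Auth("messDB", "cashier", "read"),
}


def is_authorized(database: str, role: str, operation: str, auths=AUTHS) -> bool:
    """Allow only if a positive tuple exists and no negative tuple forbids it."""
    if Auth(database, role, operation, False) in auths:
        return False
    return Auth(database, role, operation, True) in auths
```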

Furthermore,  from  figure  2  (b)  we  derive  that  there  is  a  partial  ordering  with  respect 
to  authorization  for  actions  concerning  the  claimDB,  for  the  following  roles: 

clerk  >>  approver 
expert  >>  approver 

According  to  figure  2  (a)  all  three  roles  or  types  are  subtypes  of  employee.  Note  also  that 
an  approver  cannot  be  the  same  person  as  the  claimant. 
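
The ordering and the claimant/approver restriction can be sketched in Python as follows. The sketch assumes that "clerk >> approver" means a clerk may act wherever an approver is required; this reading, and the function names, are assumptions made for illustration.

```python
# Direct "dominates" pairs read off the text: clerk >> approver, expert >> approver.
DOMINATES = {("clerk", "approver"), ("expert", "approver")}


def dominates(senior: str, junior: str, pairs=DOMINATES) -> bool:
    """Reflexive, transitive closure of the direct pairs (assumed to be acyclic)."""
    if senior == junior:
        return True
    return any(a == senior and dominates(b, junior, pairs) for a, b in pairs)


def valid_assignment(claimant_id: str, approver_id: str, approver_role: str) -> bool:
    """The approver must hold a role dominating 'approver' and differ from the claimant."""
    return dominates(approver_role, "approver") and claimant_id != approver_id
```

Clerk and expert remain incomparable with each other, which is what makes the ordering partial rather than total.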

[Figure 2: Application specified using a COLOR-X diagram. Panel (a) shows the entities and types; panel (b) shows the action boxes below.]

    PERMIT  submit(ag=person P)(go=claim CL)(rec=insurance company IC)(tmp=time T1)

    MUST    verify(ag=approver AP)(go=CL)(tmp=time T2)
            id: is-a(approver)(clerk)(sit: CL.amount < 100)
            id: is-a(approver)(expert)(sit: CL.amount > 100)
            id: T2 < T1 + period

    NEC     send(ag=IC)(go=reminder R1)(rec=approver AP)

    NEC     send(ag=IC)(go=result R2)(rec=person P)

    NEC     send(ag=IC)(go=claim CL)(rec=cashier CA)

    Arrow labels: T1:=NOW; R2="approve"; R2="deny"

The authorization tuples as derived above may in general conflict, as there are positive
and negative tuples. In the case of an approver not being allowed to shift an obligation,
we also may have the situation that the approver sets a deadline for someone else, in which
case there will appear an authorization tuple:

AUTH<oblDB, approver, shift>

which is evidently in conflict with the one we mentioned. The reason for this problem
is, however, very simple: oversimplification. In actual practice we would also register the
specific task to be performed, and not merely allow access to an entire database.
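
The conflict, and the way registering the task removes it, can be shown with a small Python sketch. The helper `find_conflicts` and the two task names are hypothetical; the paper does not name the tasks involved.

```python
from collections import defaultdict


def find_conflicts(tuples):
    """Return the keys that occur with both a positive and a negative sign.

    Each element of `tuples` is (key, positive), where key is any hashable
    description such as (database, role, operation) or one extended by a task.
    """
    signs = defaultdict(set)
    for key, positive in tuples:
        signs[key].add(positive)
    return {key for key, seen in signs.items() if seen == {True, False}}


# Database-level tuples: the two approver tuples collide.
coarse = [
    (("oblDB", "approver", "shift"), False),
    (("oblDB", "approver", "shift"), True),
]

# Task-level tuples: the same rights, qualified by a (hypothetical) task name.
fine = [
    (("oblDB", "approver", "shift", "verify_claim"), False),
    (("oblDB", "approver", "shift", "set_deadline_for_other"), True),
]
```

Here `find_conflicts(coarse)` reports the approver's shift right on oblDB, while `find_conflicts(fine)` is empty because the two tuples now refer to different tasks.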

The above security checks represent simple access-control rules which can be automatically
derived from the Workflow specifications. However, in reality the situation is much
more complex. First, there may be more complex constraints, such as: a claimant and an
approver may not be the same individual (alter-ego). The issue of constraints is discussed
further in section 4.2. Second, the distinction between a Role and Individual (Alter-ego)
may be important. In some instances, only the role is required to approve access. In
other instances, only a specific Alter-ego may approve a claim. This is further discussed
in section 4.3. Third, the above rules assume a static state where the agents represented
by Alter-egos and Roles are “alive” and respond in time. However, time is an important
factor. Rights of some individuals may be time dependent (see also [Bertino94]). In particular,
users may change their role in the organization, and new roles may need to be created
for a particular alter-ego (e.g. the expert role), or deleted. The administration of roles
and its dynamics is therefore an important issue. This is discussed in section 4.4.


4  Implementation 

The implementation of the above model largely depends on the underlying object system.
Below we sketch our approach to implement it in Mokum. Note the fact that the
various agents are responsible for performing the required security checks. This avoids
the bottlenecks associated with a centralised checking facility in a distributed system. It
also provides more flexibility in the types of checks that may be performed over those
from a standardised facility. The fact that the checks are interspersed through the code
is not a concern because the intention is to automatically derive the code and the checks
from the workflow specification. The following discussion concentrates on implementing
the run-time access control; the administration issues are discussed in Section 4.4.


4.1  Implementation  based  on  Mokum 

The main idea here is to use the “Collection Keepers” [van de Riet & Gudes96] to execute
the customized (knowledge-based) security checks. Some of the Workflow security
related pseudo-code performed by each participant (similar to Mokum’s code described in
[van de Riet & Gudes96]) is shown below.

Authorization  tuples  as  described  earlier,  are  defined  as  follows: 


type authorization_tuple
    has_a alter_ego: thing
    has_a role: thing
    has_a database: thing
    has_a access_mode: thing

The Application Administrator is responsible for performing additional authentication
checks and for authorizing operations in the workflow system. In order to do this,
the  Application  Administrator  maintains  the  authorization  database  and  processes  the 
authentication  and  authorization  events.  For  brevity  the  details  of  the  code  have  been 
omitted. 

The  insurance  company  receives  the  claim  and  directs  it  to  an  approver,  which  may 
be  a  clerk  or  an  expert. 

type insurance_company is_a thing
script
    at_trigger submission:
        message:role = claimant,
        A = message:claim:amount,
        (A < 100, choose_clerk(AP) ; choose_expert(AP)),
        add_type(AP, approver).
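
The selection step of this script can be mirrored in a short Python sketch. The `AlterEgo` class, the pools of clerks and experts, and the "first available" policy are assumptions for illustration; the Mokum choose methods are not specified in the text. Note that the script's else branch also sends a claim of exactly 100 to an expert.

```python
from dataclasses import dataclass, field


@dataclass
class AlterEgo:
    name: str
    types: set = field(default_factory=set)   # roles currently attached to this object


def choose_approver(amount, clerks, experts, threshold=100):
    """Pick a clerk for small claims, an expert otherwise, and attach the approver type."""
    pool = clerks if amount < threshold else experts
    if not pool:
        raise LookupError("no qualifying alter-ego available")
    chosen = pool[0]                 # selection policy is an assumption: first available
    chosen.types.add("approver")     # plays the part of add_type(AP, approver)
    return chosen
```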

The  claimant  is  the  submitter  of  the  claim. 

type claimant is_a person
script
    at_trigger send_claim:
        create(M, messageT, [(data_base=claimDB), (access_mode=add)]),
        send(application_administrator, authenticate_user, M),
        compose(Claim),
        /* using the incident and the amount of money */
        /* involved by interaction with the user */
        Claim to M:claim,
        send(application_administrator, authorize_user, M),
        /* Note that the authorization goes before the following */
        /* action, so if it fails the next action will not be */
        /* executed */
        send(insurance_company, submission, M).
end_script

The approver is the clerk or expert responsible for verifying and approving claims.

type approver is_a employee
script
    at_trigger verify_claim, reminder_message:
        /* note that for the sake of brevity we let the approver */
        /* do the same when it receives a new case or when he */
        /* receives reminder for some case */
        CL=message:claim, C=message:client,
        create(M1, messageT, [(data_base=claimDB),
            (access_mode=read), (claim,message:claim)]),
        send(application_administrator, authorize_user, M1),
        show_claim_on_screen(CL),
        /* now the approver may take a few days to verify the claim */
        verify_claim(CL, Result),
        create(M2, messageT, [(client=C), (result=Result)]),
        send(insurance_company, receive_from_approver, M2)
end_script

4.2  Constraint  checking 

Although  not  shown  in  the  example  given  in  figure  2,  it  is  possible  to  specify  constraints 
in  the  WFM  diagrams.  Such  constraints  may  have  the  form  “the  approver  of  a  claim  must 
be  different  from  the  claimant”.  These  constraints  can  also  be  specified  in  the  Mokum 
scripts  of  the  Alter-egos  involved. 

Constraints  may  be  specified  as  pre-conditions,  post-conditions  or  through-conditions. 
Ideally,  constraints  are  checked  by  the  various  choose  methods.  For  example,  when  the 
choose_expert  method  is  selected  above  one  could  have  as  a  Pre-condition:  not  AP  = 
sender. 

In  contrast,  if  the  constraint  specifies  that  some  particular  approver  may  not  approve 
claims  involving  a  claimant  who  has  not  paid  any  outstanding  fees,  this  may  not  be 
verifiable  when  the  approver  is  selected,  since  the  approval  process  may  take  a  significant 
amount  of  time,  in  which  the  claimant’s  account  may  become  overdue.  Such  a  condition 
therefore  needs  to  be  specified  as  a  post-condition,  to  be  checked  at  completion  of  the 
approval  activity.  This  could  be  implemented  by  adding  somewhere,  after  getting  the 
response  of  the  approver  and  before  sending  of  messages  to  cashier  and  claimant,  the 
following to the code of the trigger submission: check_due_fees(sender).

If  the  constraint  specifies  that  the  approver  may  not  be  married  to  the  claimant,  the 
intention  may  be  that  they  should  not  be  married  at  any  point  during  the  approval  process 
(which  may  take  a  while!).  It  is  clear  that  neither  a  pre-condition,  nor  a  post-condition  (or 
a combination of the two) fully checks this constraint. (A post-condition is, in principle,
usable if a history of marriage(s) is kept; it may, however, postpone handling of the
situation longer than necessary.) We call these constraints through-constraints. In our case
they could be implemented in the Mokum script of approver: instead of verify_claim, we
could have verify_claim_and_check_spouse. Of course, in the body of this procedure
the actual checking needs to be programmed, for example by inspecting a database.
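
The three kinds of constraint can be contrasted in a minimal Python sketch. It assumes an activity is a sequence of steps and approximates a through-constraint by re-evaluating it at every step boundary; a real system might instead react to change events. The function `run_activity` and its parameters are hypothetical.

```python
from typing import Callable, Iterable

Check = Callable[[], bool]


def run_activity(steps: Iterable[Callable[[], None]],
                 pre: Check = lambda: True,
                 through: Check = lambda: True,
                 post: Check = lambda: True) -> None:
    """Run an activity's steps under the three kinds of constraint.

    pre     is evaluated once, before the first step;
    through is evaluated before the first step and again after every step;
    post    is evaluated once, after the last step.
    """
    if not pre():
        raise PermissionError("pre-condition violated")
    if not through():
        raise PermissionError("through-condition violated")
    for step in steps:
        step()
        if not through():
            raise PermissionError("through-condition violated")
    if not post():
        raise PermissionError("post-condition violated")
```

A "not married" check passed as `through` stops the activity at the first step boundary after the marriage is recorded, whereas the same check passed as `post` would only fire once all steps had run.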

4.3  Roles  versus  individuals 

We  assume  that  messages  in  a  workflow  system  are  routed  to  any  qualifying  individual 
operating  in  an  appropriate  role.  It  is,  however,  possible  that  in  exceptional  cases,  the 
message  needs  to  be  directed  at  a  particular  individual  acting  in  a  role.  A  particular 
approver  may,  for  example,  be  requested  to  deal  with  a  particular  claim  because  of  expert 
knowledge  the  approver  has.  Another  example  is  the  “not-married”  constraint  above. 

Where previous constraints dealt with avoiding conflicts, it is also sometimes necessary
that people co-operate to perform a given action. Consider a bank safe that is to
be opened only after two distinct, authorised subjects have requested it. It is clear that
this requirement is easily handled in the current approach by simply requiring the related
actions from two ‘keybearers’ (who have to be distinct individuals) within an acceptably
short time period (enforced with deadlines). The Alter-ego concept gives us this essential
capability of distinguishing between and identifying certain individuals.
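
The bank-safe example can be sketched in Python as follows. The class `DualControl`, the representation of Alter-egos as strings and the handling of repeated requests are assumptions made for illustration.

```python
from datetime import datetime, timedelta


class DualControl:
    """Release an action only after two distinct authorised alter-egos request it in time."""

    def __init__(self, keybearers, window: timedelta):
        self.keybearers = set(keybearers)
        self.window = window
        self.pending = None          # (alter_ego, time) of the first request, if any

    def request(self, alter_ego: str, at: datetime) -> bool:
        if alter_ego not in self.keybearers:
            raise PermissionError(f"{alter_ego} is not a keybearer")
        if self.pending is not None:
            first, when = self.pending
            if first != alter_ego and at - when <= self.window:
                self.pending = None
                return True          # second distinct request inside the window
        # First request, repeated request, or expired window: (re)start the wait.
        self.pending = (alter_ego, at)
        return False
```

The check `first != alter_ego` is where the identity carried by the Alter-ego matters: a role name alone could not tell two keybearers apart.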

4.4  Role  administration 

As  mentioned  in  section  3,  a  Workflow  system  is  a  very  dynamic  system.  Roles  need  to  be 
created, deleted, changed, etc. The administration of roles is therefore a critical component
in  the  implementation.  For  that  purpose  we  assume  the  existence  of  the  Workflow  security 
administrator  (called  above:  Application  administrator).  This  administrator  handles  the 
following  tasks: 

1.  It  authenticates  each  new  user  and  creates  for  her  the  appropriate  role. 

2. It is consulted whenever the dynamics of the situation require a change of roles.
For example, if the Expert role is required for a particular claim, the administrator
is consulted to check whether the current Approver’s alter-ego can amplify its role
to become an Expert. If not, it checks whether there is already an active Expert role
in the system. If there is, it sends the Expert's identity to the Approver node; if there
is not, it waits for the right alter-ego to be instantiated and authenticated
with this Role.
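
The decision procedure in task 2 can be written down as a small Python sketch. The `Administrator` class, its two bookkeeping tables and the three outcome labels are assumptions; the paper does not describe how the administrator stores this information.

```python
from dataclasses import dataclass, field


@dataclass
class Administrator:
    # alter-ego -> roles it is entitled to assume (assumed bookkeeping)
    entitlements: dict = field(default_factory=dict)
    # role -> alter-egos currently active in that role
    active: dict = field(default_factory=dict)

    def resolve(self, current: str, role: str):
        """Return ('amplify', ego), ('redirect', ego) or ('wait', None) for a role request."""
        if role in self.entitlements.get(current, set()):
            self.active.setdefault(role, set()).add(current)
            return ("amplify", current)
        holders = self.active.get(role, set())
        if holders:
            return ("redirect", sorted(holders)[0])
        return ("wait", None)
```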

We see the Roles/Alter-ego administration as an essential part of the implementation.
This  implementation  can  use  the  Distributed  Mokum  architecture  described  next. 
4.5 Distributed Mokum

A  first  version  of  Distributed  Mokum  [Radu,  Dehne  &  van  de  Riet]  is  currently  being 
tested. This version of Mokum can be run at several sites. The objects can communicate
with local objects and with ‘global’ objects. An interface has been written in Java
which, using the socket mechanism, takes care of the necessary communication. Each Mokum
program has its own Administrator, which not only takes care of the external communication
problems, but also, albeit in a very limited fashion, of security problems. It is a kind of
Security Administrator. We are still looking at the problem of concentrating such security
administrative duties in an applet. In this implementation an applet is being used for the
man-machine interface between a user and the Mokum program.

How  to  implement  a  Mokum  system  which  is  truly  distributed,  where  objects  may 
consist  of  subobjects  residing  at  different  sites  and  where  addressing  these  subobjects  is 
completely transparent, is still a subject of study.

5  Supporting  technologies 

In order to securely implement the system described above, it is necessary to ensure that
the underlying infrastructure is secure. Secure communication protocols are obviously essential.
Before we review such protocols, we briefly consider CORBA to support the implementation
described above.

5.1  Implementation  based  on  CORBA 

For CORBA, which is becoming one of the major standards for Object-oriented software systems,
the Security reference model specifications have recently been published [OMG96]. Many
of our concepts map directly into the CORBA architecture. Our Alter-ego will represent
a “principal” in CORBA. In terms of protecting messages and assuring message integrity,
CORBA provides some facilities, but in our opinion these facilities are redundant in Cyberspace
because of the existence of secure communication protocols (see section 5.2).

The active customized code for each participant will be coded within the “access decision
functions” of CORBA, and is enforced automatically by the ORB (Object Request
Broker) before any Object invocation. Auditing can also be specified in CORBA’s
model and implemented by the ORB. In addition, Non-Repudiation services are
provided for reasons of Accountability. Although Roles as used in this paper are not supported
in the CORBA model, other means such as Domains may be used to implement
them. The problem of administration of authorization information is discussed in
[OMG96], but a full specification is not given.
Since the security reference model is not yet implemented and not even accepted,
one  can  resort  to  existing  CORBA  services  to  implement  our  model.  One  such  approach 
can  follow  the  scheme  suggested  by  Sheth  in  his  CORBA  based  Workflow  architecture 
[Miller  et  al96].  In  this  architecture,  there  is  a  Task  manager  associated  with  every  task 
(object  in  our  case),  which  executes  the  interface  code  defined  in  the  IDL  file.  We  will 
need  to  add  to  such  a  task  the  active  security  checks  and  the  ability  to  re-initialize  its 
parameters  each  time  it  receives  a  message  from  the  Application  administrator  requiring 
it. 


5.2  Using  Secure  communication  protocols 

The recent literature and the World Wide Web have much information about secure communication
protocols. Lipp and Hassler [Lipp & Hassler96] present a survey of such protocols.
They can be at the message level such as Netscape SSL or Microsoft PCT [PCT96] or depend
on the requirements of the application which is HTTP or higher, such as SHTTP.
SHTTP  supports  end-to-end  secured  transactions  which  can  be  initiated  symmetrically, 
so  that  servers  and  clients  are  treated  equally  with  respect  to  their  preferences.  Message 
protection  may  be  provided  by  signing,  authenticating,  or  encrypting  the  message,  or  by 
applying  any  combination  of  these.  There  are  basically  two  types  of  messages  in  the  above 
described  Workflow  system.  There  are  the  standard  messages  between  the  participants’ 
Alter-egos.  These  messages  are  assumed  to  contain  the  Sender’s  pair  (Role,  Alter-ego) 
and  the  receiver’s  Role.  The  integrity  of  messages  provided  by  the  lower  level  protocols 
is  sufficient.  Once  the  Alter-ego  is  authenticated,  and  a  Role  is  assigned,  the  pair  (Role, 
Alter-ego)  is  assumed  to  be  part  of  every  message  (e.g.  in  its  header)  and  this  is  sufficient 
for  applying  the  higher-level  customized  security  checks. 

The existence of the pair (Role, Alter-ego), or actually the triple (Site, Role, Alter-ego),
can also be used by security firewalls. The firewall will be associated with the workflow
application and will reject any message which does not have in the header the (Alter-ego,
Role) pair.
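
The firewall rule above can be sketched as a simple header check. This is only an
illustration: the message layout (a dictionary with a `header` entry) and the field names
`role` and `alter_ego` are assumptions, since the text only says that the pair appears
"e.g. in its header".

```python
# Hypothetical header field names; the paper does not fix a message format.
REQUIRED_FIELDS = ("role", "alter_ego")

def firewall_accepts(message: dict) -> bool:
    """Return True only if the header carries a non-empty (Alter-ego, Role) pair."""
    header = message.get("header", {})
    return all(header.get(field) for field in REQUIRED_FIELDS)
```

A message that fails this check would be dropped before any of the higher-level customized
security checks are applied.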

Another problem is the protection of documents which may be required by the Expert
system to approve the claim. One needs to authenticate the validity of these documents,
and message protection alone is not sufficient. A scheme similar to the one suggested to
protect documents on the Web, i.e. the CCI-PGP scheme, can be used for that [Weeks96].

6  Conclusions 

This  paper  considered  security  in  a  workflow  system.  It  has  been  argued  that  the  notion  of 
Alter-egos  is  central  in  such  workflow  systems.  The  participation  of  an  Alter-ego  in  each 
message enables the complete authentication and some specific individual-based checks
that  are  required  in  such  an  environment. 

It  is  clear  from  the  previous  section  that  different  implementation  alternatives  do  not 
only  hold  different  advantages,  but  are  based  on  different  trusted  components:  Note  in 
particular  the  trust  placed  in  the  various  actors  in  the  case  of  the  Mokum  implementation. 

It is clear that the role concept in workflow systems holds particular benefits. Additional
research needs to be done to investigate this potential, in particular for third generation
workflow systems.
Abstract

In this paper, we propose a two-tier indexing scheme for
multilevel secure database systems, primarily with the intent
of improving query response time and reducing the storage
required for indexing. At the bottom tier, our scheme requires
separate single-level indices, one for each partition
of the multilevel relation; at the top tier, our scheme requires
a coarse multilevel index consisting of only those key values
from the single-level indices that are necessary to direct
a query to the appropriate single-level index. Our scheme
seems suitable for both single-level and range queries. We
prove this claim for B+ trees by providing a detailed performance
analysis for this index structure. We also give the
algorithms for inserting and deleting key values, as well as
for searching for a key value, in the proposed index structure.


1  Introduction 

Database systems often index a relation to make access
to the relation's tuples faster than is possible by a sequential
scan of the entire relation. An index is created on some indexing
field (typically the primary key) of the relation; the
index file stores each value of the indexing field together
with a pointer to the block in the physical storage that contains
the record with that field value. The index file, being
much smaller than the relation, can be searched much faster
than the latter.

This efficiency is not always achieved in trusted
database management systems (DBMSs) (e.g., Informix
OnLine/Secure [4], Trusted Oracle [9], and Sybase Secure
SQL Server [10]). This is because they provide only two
indexing options: either multiple single-level indices, with a
separate index for data at each security level, or a trusted
multilevel global index over all data in the multilevel relation.
While the single-level index structure works well for those
queries where users specify the security level of the data
(we refer to these as single-level queries), it performs poorly
if the security levels of the data are left unspecified by the
users. A common kind of query is the range query, in which
the user gives a desired range of values for the indexing
field and wishes to retrieve all those tuples whose values in
the indexing field fall within the desired range;
single-level indices perform dismally in the case of range
queries [8]. The problem with the multilevel index is that
while it is more efficient for answering range queries than
the single-level index structure, it is less efficient for
single-level queries. Example single-level and range queries are
given in figures 1 and 2, respectively.

SELECT EMP.NAME, EMP.SALARY
FROM EMP
WHERE ROWLABEL = 'SECRET'

Figure 1. A Single-level Query

In this paper, we propose a two-tier indexing scheme
which retains the advantages of both kinds of index structures.
We maintain multiple single-level indices, one for
each security level, and construct a trusted coarse multilevel
index over these single-level indices. The coarse index consists
of only those key values from the single-level indices
that are necessary to direct a query to the appropriate
single-level index. Our scheme seems suitable for both single-level
queries and range queries. To prove this claim, we
choose the B+ tree as the indexing scheme, and provide a detailed
performance analysis for this index structure.

The rest of the paper is organized as follows. Section 2
describes the related work. Section 3 contains an example
to illustrate the different indexing schemes and their relative
merits. Section 4 gives an informal description of the
different steps involved in creating and managing our index
scheme. (The detailed algorithms have been left out for lack
of space.) In sub-section 4.4, we describe a variant of our
indexing scheme which is cheaper to use in terms of storage,
but sometimes is more expensive in terms of querying.
In section 5, we give details of our performance model, and
summarize the results of our analysis in section 6. Section
7 concludes the paper with a discussion of our future work.

SELECT EMP.NAME, EMP.SALARY
FROM EMP
WHERE SALARY >= 20000 and
SALARY <= 50000

Figure 2. A Range Query

2  Related  Work 

For obvious reasons, indexing schemes have received
wide attention from database researchers (e.g., see [3, 5, 6]).
The notion of the B-tree, of which the B+ tree is a variant, was first
introduced by [1]. Algorithms for searching, insertion, and
deletion in B-trees and B+ trees can be found in [5], while
the issue of concurrent operations on B-trees has been dealt
with in [2] and [7].

To the best of our knowledge, the only work that deals
with indexing for MLS databases is by [8]. They maintain
multiple single-level indices, one index for each security
class. To facilitate range queries, they add cross links in
the index structure; a cross link is a pointer from a block
$B_i$ to another block $B_j$ at a lower security level, signifying
that the range of data in $B_i$ contains a subset of the range
of data in $B_j$. Cross links help in the evaluation of a range
query by allowing immediate access to the block containing
the relevant data in the next index once an initial index has
been searched for values in the range.

Although the concept of cross links is novel and seems to
speed up the evaluation of range queries, the cost of maintaining
the cross links appears to be very high. The authors
assume that the security levels in the system form a total order;
it seems that for any arbitrary partial order of security
classes the overhead of maintaining cross links may be prohibitive.
In fact, the authors acknowledge that it is impossible
to predict whether indexing with cross links performs
better or worse than the single-level index structures.

3  Motivation  for  Our  Approach 

Consider the multilevel index file F shown below:

Key Values: 5 (TS), 7 (TS), 8 (TS), 9.5 (TS), 10 (S),
12 (S), 15 (C), 13 (C), 7.5 (C), 7.3 (C),
20 (S), 21 (S), 22 (S), 9 (TS), 7.6 (C),
11 (S), 6 (TS), 14 (C)

A  single-level  index  structure  for  this  file  is  shown  in 
figure  3.  If  a  user  query  does  not  specify  the  security  level 
of  the  data  to  be  retrieved,  then  this  query  must  be  processed 
at  least  at  those  partitions  whose  levels  are  dominated  by  the 
level  of  the  search  request.  This  is  useful  if  the  search  yields 
data  at  multiple  levels.  However,  if  the  relevant  records  are 
found  in  only  a  small  number  of  the  partitions  or  only  at  a 
single  level  then  the  effort  in  searching  indices  at  the  other 
partitions  is  wasted.  This  is  particularly  wasteful  if  there  are 
a  large  number  of  security  levels  and  hence  a  large  number 
of  partitions  to  search.  Note  that  single-level  indices  have 
the  advantage  that  they  are  easier  to  maintain  and  can  be 
maintained  locally  at  each  partition. 

The alternate approach is to maintain a global multilevel
index for the index file F as shown in figure 4. It performs
well for range queries, but not for single-level queries because
the global index is created over all key values.

We try to combine the advantages of both the indexing
schemes with our approach. The index structure at the bottom
tier consists of a collection of B+ trees, one for each
single-level partition of the multilevel relation. On top of
these single-level indices, we maintain a multilevel B+ tree
index consisting of selected key values from each of the
single-level indices. Unlike the global indexing scheme, the
top tier index is really a coarse index; it contains only those
values that are necessary to direct a query to the appropriate
single-level index. Since we do not want to repeat the
search through a single-level index starting from its root, the
pointer from the leaf node of the coarse level index points
to the leaf node of the relevant single-level index.

Figure 5 shows the proposed index structure for the index
file F. Single-level queries are evaluated by bypassing the
coarse index and going directly to the appropriate single-level
index. All other queries are evaluated at the coarse
level first and then directed to the leaf nodes of only those
single-level indices that may contain the desired key. The
following section describes the indexing scheme in greater
detail.

4  The  Indexing  Scheme 

There are several possible variations of our coarse index
structure. Each trades off the amount of storage required by
the index structure for the cost of a query. We will describe
only one of the index structures, which we choose to call
Coarse Index 1, in detail; this will give an understanding
of the key issues involved in the proposed structure. We
will briefly describe a variant, called Coarse Index 2, to
give some idea about different coarse index structures and
then provide a detailed performance analysis of both index
structures. A discussion of other variations of our index
structures will be the content of future work.

Key Values: 5 (TS), 7 (TS), 8 (TS), 9.5 (TS), 10 (S), 12 (S), 15 (C), 13 (C), 7.5 (C),
7.3 (C), 20 (S), 21 (S), 22 (S), 9 (TS), 7.6 (C), 11 (S), 6 (TS), 14 (C)
Partitions: Top Secret (TS), Secret (S), Confidential (C)

Figure 3. Single-level Index for Each Partition

Below,  when  we  speak  of  a  key,  we  assume  that  it  also 
includes  the  security  label  associated  with  the  key. 

4.1  Inserting  a  key  in  the  index 

A key $K_i$ is inserted in the coarse index as follows:

1. Search the relevant single-level index structure to find
out if the key $K_i$ being inserted is already present. If
so, return with an insertion violation error.

2. If $K_i$ can be inserted, then insert it at the relevant
single-level index.

3. Determine if $K_i$ and/or any other key $K_j$ from this
single-level index need to be inserted at the coarse
level index:

(a) If, after $K_i$ is inserted at the single-level index,
$K_i$ is the largest key in the node into which it
was inserted, then $K_i$ needs to be inserted at the
coarse level index.

(b) If the node of the single-level index in which $K_i$
is to be inserted has to be split into two nodes to
accommodate $K_i$, then the largest keys $K_j$ and
$K_k$ in each of the two nodes need to be inserted
at the coarse level index. Note that a situation
can arise where one of $K_j$ or $K_k$ was the largest
key value in the node before it was split. In
that case, this key value is already present in the
coarse index and need not be re-inserted. Further,
one of $K_j$ and $K_k$ may be the same as $K_i$;
in this case the key will be inserted according to
step 3a above.

(c) In all other cases there is no need to insert any
value at the coarse index and the insertion process
is complete.

4. If a key $K_l$ ($K_i$ and/or some other key) from step 3
above is being inserted in the coarse index, determine
if there is a need to get a key from a single-level index
other than the one to which $K_l$ belongs. (This process
is necessary because there is intermingling among the
keys.)

(a) If $K_l$ is to be inserted between keys $K_m$ and
$K_n$ in the coarse index, then visit the single-level
index pointed to by the pointer $P_n$ (for key
$K_n$), provided this single-level index is different
from the one that $K_l$ is in. Note that this pointer
points to key values that are less than or equal
to $K_n$ but greater than $K_m$. Thus, there is a
possibility that there are keys in the index pointed
to by $P_n$ that are less than $K_l$. In such a case,
get the largest of such keys from the single-level
index and insert it along with $K_l$ in the coarse
level index.

(b) If $K_l$ is to be inserted as the smallest key in the
coarse index and if the pointer corresponding to
the current smallest value in the coarse index
points to a single-level index different from the
one in which $K_l$ is, then as in step 4a, get the
largest key value in this index that is smaller
than $K_l$ and insert it in the coarse index along
with $K_l$.

(c) In all other cases, insert $K_l$ only in the coarse
index.

5. Once all the relevant keys have been inserted in the
coarse index, along with pointers to the corresponding
leaf nodes of the single-level indices containing
these keys, the insertion process terminates.
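
Steps 1 to 3 above can be sketched for a single leaf node as follows. This is a minimal
illustration, not the authors' implementation: a leaf node is modelled as a sorted Python
list, `capacity` is a hypothetical parameter for the number of keys a leaf can hold, the
duplicate check looks only at this leaf rather than the whole single-level index, and
step 4 (fetching a key from a different single-level index) is deliberately left out.

```python
import bisect

def insert_into_leaf(leaf, key, capacity):
    """Insert `key` into one leaf node (a sorted list) of a single-level index.

    Returns (nodes, promoted): the resulting leaf node(s) and the keys that
    step 3 nominates for insertion into the coarse index.
    """
    if key in leaf:                                   # step 1 (simplified)
        raise ValueError("insertion violation: key already present")
    old_largest = leaf[-1] if leaf else None
    merged = list(leaf)
    bisect.insort(merged, key)                        # step 2
    if len(merged) <= capacity:
        # Step 3a: nominate the key only if it is now the largest in its node.
        promoted = [key] if merged[-1] == key else []
        return [merged], promoted
    # Step 3b: the node overflows, so split it and nominate the largest key of
    # each half, skipping a key that was the node's largest before the split
    # (it is assumed to be in the coarse index already).
    mid = (len(merged) + 1) // 2
    left, right = merged[:mid], merged[mid:]
    promoted = [k for k in (left[-1], right[-1]) if k != old_largest]
    return [left, right], promoted
```

For instance, inserting 9 into the full node `[5, 6, 7, 8]` with `capacity=4` splits it into
`[5, 6, 7]` and `[8, 9]` and nominates 7 and 9; the old largest key 8 is not nominated again.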

4.2  Deleting  a  key  from  the  index 

We now describe how a key is deleted from the index
structure. An important feature to note in this deletion process
is that if a key $K_i$ is deleted from the coarse index, we
may need to insert another key value $K_j$ from the single-level
index containing $K_i$, such that $K_j$ is the largest value
smaller than $K_i$ in the single-level index. If $K_j$ occupies
the same position in the coarse index that $K_i$ originally did,
then we will just replace $K_i$ with $K_j$. Otherwise, we may
have to insert a second key $K_k$ from a different single-level
index as was required in the insertion algorithm.

1. If $K_i$ is not present in the single-level index, then return
with a deletion violation error.

2. If key $K_i$ can be deleted, then delete it from the
single-level index.

3. If $K_i$ is in the coarse index, then delete $K_i$ from the
coarse index. Get a replacement for $K_i$ from the
single-level index that $K_i$ was in as follows:

(a) If the deletion process results in the merging
of two adjacent leaf nodes, then let $K_j$ be the
largest key value in the merged node.

(b) If the deletion process does not result in any
merging of leaf nodes, then let $K_j$ be the largest
key smaller than $K_i$ in the node that contained
$K_i$.

(c) $K_j$ is $K_i$'s replacement in the coarse index provided
$K_j$ is not already present in the coarse index.

4. If $K_j$ is not already present in the coarse index, insert
$K_j$ following the insertion algorithm described earlier.
Note that, as in the insertion algorithm, this step
may require a second key $K_k$ from some single-level
index different from the one containing $K_j$, to be inserted
in the coarse index.

5. This completes the key deletion process.

4.3  Searching  for  a  key  in  the  index 

The search process is straightforward and works as follows:

1. If the key $K_i$ being searched for is in the coarse index,
locate $K_i$ in a leaf node of the coarse index. Follow
the pointer $P_i$ corresponding to $K_i$ to a leaf node of
some single-level index. Locate $K_i$ in this leaf node
and return with success.

2. If $K_i$ is not in the coarse index, locate the smallest key
value $K_j$ in the coarse index that is larger than $K_i$.

(a) Follow the pointer $P_j$ corresponding to $K_j$ to the
leaf node of some single-level index containing
$K_j$.

(b) Search for $K_i$ in this leaf node. If found, return
search successful; otherwise return search
unsuccessful.
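
The search procedure can be sketched as follows, under simplifying assumptions that are
not part of the paper: the leaf level of the coarse index is modelled as a sorted list of
`(coarse_key, leaf)` pairs, each `leaf` is the list of keys in the single-level leaf node
that the entry points to, and keys are plain numbers rather than (value, label) pairs.

```python
import bisect

def coarse_search(coarse, key):
    """Search for `key` through a coarse index; return True if it is found."""
    coarse_keys = [k for k, _ in coarse]
    # Smallest coarse key >= key: covers step 1 (exact hit) and
    # step 2 (next larger coarse key) with one binary search.
    pos = bisect.bisect_left(coarse_keys, key)
    if pos == len(coarse):
        return False              # key is larger than every coarse key
    _, leaf = coarse[pos]
    return key in leaf            # only one single-level leaf node is inspected

# Leaf nodes built from the key values of index file F (node size 3 is assumed).
ts1, ts2 = [5, 6, 7], [8, 9, 9.5]
c1, c2 = [7.3, 7.5, 7.6], [13, 14, 15]
s1, s2 = [10, 11, 12], [20, 21, 22]
coarse = [(7, ts1), (7.6, c1), (9.5, ts2), (12, s1), (15, c2), (22, s2)]
```

With this layout, `coarse_search(coarse, 7.5)` is directed to the Confidential leaf `c1`
without touching the Top Secret or Secret indices.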

4.4  A  variation  of  the  coarse  index  structure 

Refer to the key insertion procedure described in section
4.1. Note that whenever a leaf node at a single-level index
is split, we insert a key value from each of the resulting two
nodes in the coarse index. This results in the coarse index
having at least one entry in its leaf nodes for each of the
leaf nodes of any single-level index. Finding a key at a leaf
node of a single-level index after the search has traversed
the coarse index structure therefore requires a search of only one
leaf node of the single-level index.

In this variation, henceforth called Coarse Index 2, we no
longer require that for every leaf node of a single-level index
there be at least one entry in the coarse level index. Instead,
for a chain of leaf nodes $N_1, N_2, \ldots, N_p$ in that order, we
include some value $K_p$ of $N_p$ at the coarse index and make
the pointer of $K_p$ point to leaf node $N_1$. As a result, all key
values in $N_1, N_2, \ldots, N_p$ up to the value $K_p$ are pointed at
by the pointer of $K_p$ in the coarse index. To find a key value
that is less than or equal to $K_p$ in the single-level index,
we follow $K_p$'s pointer in the coarse index to leaf node $N_1$,
then sequentially search the chain of leaf nodes till we either
come to the desired value or get to $K_p$.
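
The sequential scan used by Coarse Index 2 can be sketched as below. The representation is
an assumption made for illustration: `chain` is the list of leaf nodes starting at the node
that the coarse entry points to, each node is a sorted list of keys, and the function also
reports how many leaf nodes were visited, which is the extra query cost of this variant.

```python
def chain_search(chain, key, k_p):
    """Scan a chain of leaf nodes for `key`, stopping once k_p is reached.

    Returns (found, nodes_visited).
    """
    visited = 0
    for node in chain:
        visited += 1
        for k in node:
            if k == key:
                return True, visited       # came to the desired value
            if k >= k_p:
                return False, visited      # got to K_p without finding the key
    return False, visited
```

For example, with `chain = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]` and `k_p = 8`, looking up 5
visits two leaf nodes, whereas Coarse Index 1 would have pointed directly at the second node.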


Figure 6.1 shows the Coarse Index 1 and figure 6.2 shows
its variant, Coarse Index 2, for the same set of key values.

Note that during the insertion or deletion process, we have to
find out if there is a value in some other single-level index
that needs to be inserted at the coarse index, just as we did
for Coarse Index 1. This alternate structure reduces the size
of the coarse index considerably at the expense of increased
cost of query (that is, more blocks need to be
accessed to get to the key). The complete analysis of both
the structures is given in section 5 below.

5  Performance  Analysis 

In order to assess the suitability of the proposed indexing
schemes for specific secure database applications, it is important
to determine the costs and benefits of the schemes.
We now derive analytical expressions for two chosen performance
metrics for the single-level indices, the Global indexing
scheme, and the two Coarse Index methods.

5.1  Analytical  model 

We model an MLS database system in terms of the following
parameters:

• The number of keys (or records) in the system ($D$)

• The number of distinct security levels ($L$)

• The distribution of the keys among the different security
levels ($R_1 : R_2 : \ldots : R_L$)

• The expected number of keys in a given query range
($Q$) (Here, we assume that a query is a read-only operation
which specifies a range of keys to be searched.)

• The order of the B+ trees ($P$)

• The average fullness factor for the nodes in the B+
trees ($F$) (that is, on an average only a fraction $F$ of
the entries in any given index tree are filled. Obviously,
$1 \ge F \ge 0.5$ by the definition of the B+ tree.)

• The average size of the clusters ($S_1, S_2, \ldots, S_L$),
which is determined by the interleaving of keys of
different security levels.

The model parameters are summarized in Table 1. The
following example database illustrates the notation. Since it
is also used as a base case in the results section, to describe
the effect of different parameters on the performance, we
refer to it as the base database. The base database has one
million keys ($D = 10^6$), with four distinct security levels
(U, C, S, TS), with the keys being distributed
in the ratio of 4:3:2:1 among the four classes (where 4, 3, 2,
1 correspond to levels U, C, S, and TS, respectively). The
query under consideration covers 1000 keys ($Q = 1000$).
The order of the tree is 10 ($P = 10$) and each node is
assumed to be full ($F = 1.0$) at the time the query arrives.
The average cluster sizes of the U, C, S, and TS classes are
100, 75, 50, and 25 keys, respectively. For example, on the
average, the number of consecutive keys with security level
U is 100. The data for the base case is also included in Table
1.

Key Values: 3 (TS), 7 (S), 5 (TS), 4 (C), 11 (S), 2 (TS), 1 (TS), 2.5 (C), 12 (S), 7.5 (TS), 13 (S), 4.5 (C), 2.3 (C)

Figure 6. The Two Variants of the Coarse Index Structure on the Same Keys

Mnemonic  Description                                        Base case
D         Total number of keys                               10^6
L         Security levels                                    {U, C, S, TS}
R_i       Portion of level i keys                            R_1 = 4, R_2 = 3, R_3 = 2, R_4 = 1
Q         Average number of keys within a given query range  1000
P         Order of the tree                                  10
F         Node fullness factor                               1.0
S_i       Average size of level i cluster                    S_1 = 100, S_2 = 75, S_3 = 50, S_4 = 25

Table 1. Model parameters and Base case

From the basic model, we can derive the following factors
which are used in the rest of the analysis:

• Total number of keys of level $i$, $k_i = D \cdot \frac{R_i}{\sum_{j=1}^{L} R_j}$

• The average number of clusters of level $i$, $c_i = k_i / S_i$

• Average number of keys in the query range corresponding
to level $i$, $q_i = Q \cdot \frac{R_i}{\sum_{j=1}^{L} R_j}$

• Average number of clusters in the query range corresponding
to level $i$, $r_i = q_i / S_i$

• The total number of clusters in the query range, $\sum_{i=1}^{L} r_i$

Note that we use upper case letters for the basic model
parameters and lower case letters for derived parameters.
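
As a quick numeric check, the derived factors can be evaluated for the base database. The
sketch below assumes that the level shares are obtained by normalizing the ratio 4:3:2:1 by
its sum, which is how the derived factors are read here; the dictionary names are chosen for
illustration only.

```python
D = 10**6                                     # total number of keys
Q = 1000                                      # keys in the query range
R = {"U": 4, "C": 3, "S": 2, "TS": 1}         # key distribution ratio
S = {"U": 100, "C": 75, "S": 50, "TS": 25}    # average cluster sizes

total = sum(R.values())
k = {lvl: D * R[lvl] / total for lvl in R}    # keys per level
c = {lvl: k[lvl] / S[lvl] for lvl in R}       # clusters per level
q = {lvl: Q * R[lvl] / total for lvl in R}    # keys per level in the query range
r = {lvl: q[lvl] / S[lvl] for lvl in R}       # clusters per level in the query range
r_total = sum(r.values())                     # clusters in the query range
```

In the base case every level has 4000 clusters overall and 4 clusters in the query range,
because the cluster sizes are proportional to the key shares.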

5.2  Performance  metrics 

To compare the performance of different index methods,
we chose two metrics: size of the index table and the cost
of executing a query. These are defined as follows:

1. Size of index table: We measure the size in terms
of the number of nodes (often referred to as blocks
in literature [3]) of the tree representing the index
structure. Obviously, the smaller the size, the
more suited it is for database applications, especially
those running on limited memory machines. We denote
this metric by $\alpha$. The metric is computed for
the Global index ($\alpha_1$), the Coarse Index 1 ($\alpha_2$), the
Coarse Index 2 ($\alpha_3$), and the four single-level indexes
($\alpha_u$, $\alpha_c$, $\alpha_s$, $\alpha_{ts}$).

2. Cost of query execution: During the execution of a
range query, an index is searched to access the actual
data. Since the number of data blocks to be accessed
for executing the query is independent of
the index method, we have defined the cost of query
execution only in terms of the number of nodes (or
blocks) of the index structure (B+ tree) that would be
searched (or accessed). This measure is well-suited
for comparing the different indexing methods being
proposed in this paper. We denote this cost by $\beta$. The
metric is computed for queries at the four different
levels. It is assumed that a query at a given level
needs to access data corresponding to its own level
as well as those dominated by it. For example, an S
level query accesses data related to U, C, and S levels.
Accordingly, the cost of a query at S level includes
the cost of accessing indexes at U, C, and S levels.
We compute this metric for the Global index ($\beta_1$), the
Coarse Index 1 ($\beta_2$), the Coarse Index 2 ($\beta_3$), and the
four single-level indexes ($\beta_u$, $\beta_c$, $\beta_s$, $\beta_{ts}$).

5.3  Evaluation  of  the  index  size  metric 

In order to estimate the size of a B+ tree, we first need
to determine its height. Given the order ($P$), the fullness
factor ($F$), and the number of keys $K$ (or data pointers at
the leaf level), the height of a tree ($h$) can be estimated to
be [3]

$$h = \frac{\log\bigl(K / (F \cdot (P-1))\bigr)}{\log(F \cdot P)} + 1 \qquad (1)$$

Here, we assume that while the intermediate nodes of a
B+ tree can hold up to $P$ pointers to lower level nodes, the
leaf node only holds up to $(P - 1)$ data pointers as one of
the pointers is used for pointing to its right sibling node.
Using this equation, the expressions for the single-level indices
and the global index can be easily derived by proper
substitutions for $K$. These are summarized in Table 2. The
expressions for the coarse indices, however, are more complex.
We now derive these expressions.

In the case of Coarse Index 1, the leaf node has one or
more pointers to a cluster depending on whether or not a
cluster covers one or more nodes at the leaf level of a single-level
index. In other words, if a cluster of keys occupies $n$ (partial
or full) nodes at the corresponding index tree's leaf level,
then Coarse Index 1 would have $n$ pointers pointing to the
$n$ nodes. If $u_i$ denotes the average number of nodes (or
blocks) that a cluster of level $i$ occupies in the single-level
index at level $i$, then it can easily be derived as follows.

$$u_i = 1 + \left\lfloor \frac{S_i - 1}{F \cdot (P-1)} \right\rfloor + \frac{(S_i - 1) \odot (F \cdot (P-1))}{F \cdot (P-1)} \qquad (2)$$

where $\odot$ represents the modulus operator.

Since there are $c_i$ clusters for level $i$, Coarse Index 1
would point to $\sum_{i=1}^{L} c_i \cdot u_i$ nodes at the single-level
indices. Thus, the height of Coarse Index 1 is given by substituting
this term for $K$ in Equation (1).
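
The height estimate and the nodes-per-cluster term can be evaluated numerically for the base
case. The sketch below rests on two assumptions about the printed formulas: Equation (1) is
taken to be $h = \log(K/(F(P-1)))/\log(FP) + 1$ without rounding, and the last term of the
expression for $u_i$ is taken to be the remainder of $(S_i - 1)$ divided by $F(P-1)$, itself
divided by $F(P-1)$. The function names are chosen for illustration.

```python
import math

def height(K, P, F):
    """Height estimate of a B+ tree with K data pointers (Equation (1) as read here)."""
    return math.log(K / (F * (P - 1))) / math.log(F * P) + 1

def nodes_per_cluster(S_i, P, F):
    """Average number of leaf nodes that a cluster of S_i keys spans."""
    m = F * (P - 1)                     # data pointers per leaf node
    return 1 + math.floor((S_i - 1) / m) + ((S_i - 1) % m) / m

P, F = 10, 1.0
S = {"U": 100, "C": 75, "S": 50, "TS": 25}    # cluster sizes of the base case
c = {lvl: 4000 for lvl in S}                  # clusters per level in the base case
u = {lvl: nodes_per_cluster(S[lvl], P, F) for lvl in S}

K_coarse1 = sum(c[lvl] * u[lvl] for lvl in S)  # pointers held by Coarse Index 1
K_coarse2 = sum(c.values())                    # pointers held by Coarse Index 2
h1 = height(10**6, P, F)                       # global index
h2 = height(K_coarse1, P, F)
h3 = height(K_coarse2, P, F)
```

Under these assumptions a level-U cluster of 100 keys spans 12 leaf nodes on average, and
the coarse indices come out roughly one to two levels shallower than the global index.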

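Since Coarse Index 2 only has one pointer to each of
the clusters, it would point to $\sum_{i=1}^{L} c_i$ nodes at the single-level
indices. Hence, its height is given by substituting this value
for $K$ in Equation (1). These are summarized in Table 2.

Mnemonic  Expression
h_u       $\log(k_u / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_c       $\log(k_c / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_s       $\log(k_s / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_ts      $\log(k_{ts} / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_1       $\log(D / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_2       $\log(\sum_{i=1}^{L} c_i \cdot u_i / (F \cdot (P-1))) / \log(F \cdot P) + 1$
h_3       $\log(\sum_{i=1}^{L} c_i / (F \cdot (P-1))) / \log(F \cdot P) + 1$

Table 2. Summary of Height Computations

Mnemonic  Expression
α_u       $((F \cdot P)^{h_u} - 1) / (F \cdot P - 1)$
α_c       $((F \cdot P)^{h_c} - 1) / (F \cdot P - 1)$
α_s       $((F \cdot P)^{h_s} - 1) / (F \cdot P - 1)$
α_ts      $((F \cdot P)^{h_{ts}} - 1) / (F \cdot P - 1)$
α_1       $((F \cdot P)^{h_1} - 1) / (F \cdot P - 1)$
α_2       $((F \cdot P)^{h_2} - 1) / (F \cdot P - 1)$
α_3       $((F \cdot P)^{h_3} - 1) / (F \cdot P - 1)$

Table 3. Summary of Size computations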

Once the height $h$ is known, the size of an index table
may be computed using the properties of the B+ tree as

$$\alpha = \frac{(F \cdot P)^{h} - 1}{F \cdot P - 1} \qquad (3)$$

By substituting the appropriate term for $h$ in Equation
(3), we can compute the size metric for the single-level indices
($\alpha_u$, $\alpha_c$, $\alpha_s$, $\alpha_{ts}$), the Global index ($\alpha_1$), the Coarse
Index 1 ($\alpha_2$), and the Coarse Index 2 ($\alpha_3$). These are summarized
in Table 3.
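
Equation (3) is the closed form of a geometric series: a tree of height $h$ with an effective
fan-out of $F \cdot P$ has $1 + FP + (FP)^2 + \ldots + (FP)^{h-1}$ nodes. The short sketch
below checks this for an integer height; the function name is chosen for illustration, and
the formula requires $F \cdot P \neq 1$.

```python
def index_size(h, P, F):
    """Number of nodes in a tree of height h with effective fan-out F * P (Equation (3))."""
    return ((F * P) ** h - 1) / (F * P - 1)

# A full tree of order 10 and height 3 has 1 + 10 + 100 nodes.
levels = [(1.0 * 10) ** lvl for lvl in range(3)]
```

Because the size grows by a factor of about $F \cdot P$ per level, even a one-level reduction
in height (as for the coarse indices) shrinks the index table by roughly that factor.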


5.4  Evaluation  of  the  cost  of  query  execution 

Let  us  first  consider  the  case  of  single-level  indices. 
Cleariv.  given  a  U  level  query,  we  need  to  access  only  the 
index  for  I'  level.  Once  we  reach  the  leaf  node  of  the  in¬ 
dex  that  represents  the  beginning  of  the  query  range,  data 
pointers  to  other  keys  may  be  obtained  by  traversing  the 
leaf  nodes  horizontally  (using  the  right  sibling  links)  until 


the  range  is  covered.  Accordingly,  the  cost  metric  is 
given  by 


Pu 


Qu 

P-(P-l) 


(4) 


For  a  query  of  level  C,  we  need  to  access  both  keys  of 
type  U  and  C  by  repeating  search  on  single-level  indices  of 
C  and  U.  Accordingly, 


0c 


hxi  -b 


LP-(P-1)J 


+  /if  -b 


9c 


LP-(P-i) 


(5) 


We can derive similar expressions for β_s and β_ts. These
are summarized in Table 4.

In the case of the Global index, since keys of all levels are
present at the leaf, all keys in the query range (i.e., Q keys)
need to be searched, for any level of query. Hence,

\[
\beta_1 = h_1 + \frac{Q}{F \cdot (P - 1)} \qquad (6)
\]


The same β_1 will be applicable for all four types of
queries (U, C, S, TS).
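
The cost expressions for the single-level and Global indices can be
sketched numerically as follows. This is an illustration under stated
assumptions rather than the authors' evaluation code: the height form
log_{F·P}(K / (F·(P−1))) + 1 is read from Table 2, the base-case values
F = 1.0, P = 10 and D = 10^6 are assumed (Table 1 is not reproduced
here), and all function names are hypothetical.

```python
import math


def height(entries, fullness, order):
    """Assumed height form: log base F*P of the leaf count, plus one."""
    leaf_nodes = entries / (fullness * (order - 1))
    return math.log(leaf_nodes) / math.log(fullness * order) + 1


def leaf_scan(range_keys, fullness, order):
    """Leaf nodes visited while following sibling links across the range."""
    return range_keys / (fullness * (order - 1))


def single_level_cost(level_keys, range_keys, fullness, order):
    """Equation (4): descend one single-level index, then scan its leaves."""
    return height(level_keys, fullness, order) + leaf_scan(range_keys, fullness, order)


def global_cost(total_keys, query_range, fullness, order):
    """Equation (6): descend the Global index, then scan all Q keys."""
    return height(total_keys, fullness, order) + leaf_scan(query_range, fullness, order)


F, P, D = 1.0, 10, 10**6  # assumed base case, not taken from Table 1

cost_q100 = global_cost(D, 100, F, P)
cost_q10000 = global_cost(D, 10_000, F, P)
print(int(cost_q100), int(cost_q10000))
```

Under these assumptions the two Global index costs truncate to 17 and
1117, the values quoted for query ranges of 100 and 10000 in the
discussion of the query range effect. A level-C query, as in Equation
(5), would add two calls to `single_level_cost`, one for the U index and
one for the C index.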

The computations of the cost metrics are more complex
for Coarse Index 1 and Coarse Index 2. First let us consider
the case of Coarse Index 1. For a level j query, we need to
find the pointer to the first cluster in the given query range
for level j and for all levels dominated by j. Hence, we need
to search the leaf nodes of the coarse index, starting from
the beginning of the query range and going horizontally, until
we find the first cluster for each of the required levels.
If First(i) represents the average number of nodes to be
scanned to find the first cluster of level i, starting from the
beginning of the query range, then the number of nodes scanned
in Coarse Index 1 before all the required first-cluster pointers
are obtained is given by the maximum of First(i) over all
i ≤ j. In addition, since the corresponding single-level indices
also have to be scanned horizontally, the cost of execution is
given by

\[
\beta_{2,j} = h_2 - 1 + \max_{1 \le i \le j} \mathrm{First}(i)
            + \sum_{1 \le i \le j} \frac{Q_i}{F \cdot (P - 1)} \qquad (7)
\]


First(i) may be computed by using probabilistic arguments.
(We omit the details here.) It is given as

[Equation (8), the closed-form expression for First(i), is illegible in the source.]

Table 4. Summary of Cost of Query Executions


where  U  is  the  average  size  of  clusters  in  the  query  range 
excluding  cluster  (^1  —  '  Si) 

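The computations for Coarse Index 2 are quite similar,
except that this index carries only one pointer per cluster.
Thus, the expression for β_{3,j} is given as follows.

\[
\beta_{3,j} = h_3 - 1 + \max_{1 \le i \le j} \mathrm{First}'(i)
            + \sum_{1 \le i \le j} \frac{Q_i}{F \cdot (P - 1)} \qquad (9)
\]

where First'(i) plays the role that First(i) plays for Coarse Index 1.


6  Results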


To determine the effect of different system parameters on
the two performance metrics (α and β) with different indexing
methods, we have evaluated the metrics under different
configurations. Our evaluation methodology is as follows.

1. The base case for the evaluation is as described in
Table 1.

2. Under each of the configurations, we evaluate the
index size (α) for the four single-level indices
(U, C, S, TS), the Global index (GI), Coarse Index 1
(CI1), and Coarse Index 2 (CI2). The cost of query
metric (β) is evaluated for all four types of
queries (U, C, S, TS). For each type, we evaluate the
metric when only the single-level index for that type
is available, as well as for the Global index and the
two Coarse indices.

3. To evaluate the effect of cluster size on the metrics,
we vary the cluster size (S_i) in the base model. In
the current runs, we keep the ratio of the cluster sizes
constant at 4:3:2:1, as in the base model, but vary the
constant factor from 1 through 10^6. So the cluster
sizes were varied from the set <4, 3, 2, 1> to
<4·10^6, 3·10^6, 2·10^6, 10^6>. Due to space limitations,
only a subset of the results is presented in figures 7
and 11.

4. To evaluate the effect of the key ratio on the metrics,
we vary the key ratio (R_i) in the base model. We
evaluated the metrics under a set of ratios including
1:1:1:1, 10:1:1:1, 100:1:1:1, 1000:1:1:1, ...,
1:1:1:1000. Due to space limitations, the results are
not plotted, but the summary of the observations is
presented below.

5. To evaluate the effect of the tree order (P), we vary
the order from 2 through 500. The results are
summarized in figures 8 and 12.

6. To evaluate the effect of the number of keys (D), we
experimented with D values from 100 to 10^6. Portions
of the results are summarized in figures 9 and 13.

7. To evaluate the effect of the fullness factor (F), we
changed values of F from 0.7 through 1.0. The results
are summarized in figures 10 and 14.

8. To determine the query range effect, Q was varied
from 1 through 10^6. The observations are summarized
below.

Following  is  a  summary  of  our  observations  regarding 
the  behavior  of  different  indexing  methods  under  different 
model  parameters. 

•  Index size (α): As mentioned above, index size is a
very important metric since it determines the overall
performance of the system. If the index is small
enough to fit in the fast memory of the system,
then quick turnaround times may be achieved
for query executions. Following is a summary of our
analysis illustrating the effect of different system
parameters on the size of the various index methods.

(i) Effect of Cluster size (S_i): Since the single-level
indices and the Global index have pointers to individual
data and not to clusters, this parameter has no effect
on their sizes (i.e., on the α values of these indices).
On the other hand, it has a significant impact on
the sizes of Coarse Index 1 and Coarse Index 2
(see figure 7). As the size of the clusters increases,
the number of clusters decreases. This means that the
number of elements pointed to by Coarse Index 2 also
decreases. For example, when the cluster sizes of the
four classes were 40, 30, 20, and 10, respectively,
Coarse Index 2 required 4938 nodes (or blocks). But when
each cluster is ten times larger (i.e., 400, 300,
200, 100), the size decreases to 494, a tenth of
the previous value. Thus the size of Coarse Index 2 is
inversely proportional to the cluster size.

Since Coarse Index 1 points to all nodes that a
cluster of data keys occupies in a single-level index,
the relationship between its size (α_2) and
the cluster size is not so straightforward. In
general, an increase in cluster size decreases
the index size, but not as dramatically as in the
case of Coarse Index 2. For example, in our
experiments, when the cluster sizes were changed
from 40, 30, 20, 10 to 400, 300, 200, 100,
respectively, for the four levels, the index size
reduced from 20987 nodes to 14444 nodes.

(ii) Effect of Key ratio (R_i): Since the sizes of
single-level indices directly depend on the number
of keys they have to point to, the key ratio
has an effect on individual index sizes. However,
the sum of their sizes remains essentially
unchanged. Thus, while the sizes of the single-level
indices were 30864 each when the ratio was 1:1:1:1,
they changed to 49382, 37036, 24961, and 12345,
respectively, when the ratio was changed to 4:3:2:1.
Observe that the difference in the sum of sizes in
the two cases is not significant. Obviously, the size
of the Global index is unaffected by the ratios since
it has keys of all levels. Similarly, this factor does
not have a significant impact on the sizes of Coarse
Index 1 and Coarse Index 2.

(iii) Effect of Tree order (P): Since the order
determines the maximum number of elements in
a node of an index tree, the size of the tree
decreases with increasing order (see figure 8).
While the decrease is considerable for smaller
values of tree order, the effect is not so dramatic
at higher values of tree order. For example,
while the size of Coarse Index 2 reduced from
32000 to 7100 when the order was increased from 2 to
4, it only reduced from 330 to 160 when the order
was doubled from 50 to 100. The size shows similar
trends for all index types.

(iv) Effect of Number of Keys (D): The sizes of all
indices grow linearly with the number of keys
(see figure 9). For example, when the number
of keys is increased from 10^4 to 10^6, the size of
Coarse Index increased from 20 to 2000. Similarly,
the size of the Global index grew from 1235
to 123500 for the same change in the number
of keys.

(v) Effect of Fullness factor (F): An increase in
fullness factor implies that the same tree size
can accommodate more keys. However, since a B+ tree
requires that each node be at least half-full,
F's effect on the size is not as significant as
that of other factors. Nevertheless, it can be
noticed that the size decreases, somewhat linearly,
with the fullness factor (see figure 10).
For example, as the fullness factor is increased
from 0.7 to 0.8 and then to 0.9, the size of
Coarse Index 2 changed from 2963 to 2540 and
then to 2222.

(vi)  Effect  of  Query  Range  (Q):  Obviously,  there  is 
no  effect  of  query  range  on  the  index  size. 
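
The Coarse Index 2 sizes quoted in items (i), (iii) and (v) can be
approximated with a short calculation. The sketch below is not the
authors' evaluation code. It assumes the closed form obtained by
substituting the height log_{F·P}(K / (F·(P−1))) + 1 into Equation (3),
a base case of F = 1.0, P = 10 and 10^6 keys split 4:3:2:1, and a
base-case cluster count of 16,000 that is back-solved from the reported
figures because Table 1 is not reproduced here. All names are
hypothetical.

```python
def index_size(entries, fullness, order):
    """Nodes in a B+ tree index over `entries` pointers.

    Equation (3) with the assumed height substituted in:
    (F*P)**h collapses to P * entries / (P - 1).
    """
    return (order * entries / (order - 1) - 1) / (fullness * order - 1)


# Coarse Index 2 carries one pointer per cluster, so its entry count is
# the number of clusters. Keys per level assume a 4:3:2:1 split of 10**6.
KEYS_PER_LEVEL = [400_000, 300_000, 200_000, 100_000]


def cluster_count(cluster_sizes):
    """Total clusters when every level is cut into equal-sized clusters."""
    return sum(keys // size for keys, size in zip(KEYS_PER_LEVEL, cluster_sizes))


# Item (i): cluster sizes 40:30:20:10 versus 400:300:200:100.
small_clusters = index_size(cluster_count([40, 30, 20, 10]), 1.0, 10)
large_clusters = index_size(cluster_count([400, 300, 200, 100]), 1.0, 10)

BASE_CLUSTERS = 16_000  # assumption, back-solved from the quoted sizes

# Item (v): fullness factor 0.7, 0.8, 0.9 at order 10.
by_fullness = {f: index_size(BASE_CLUSTERS, f, 10) for f in (0.7, 0.8, 0.9)}

# Item (iii): tree order 2, 4, 50, 100 at fullness 1.0.
by_order = {p: index_size(BASE_CLUSTERS, 1.0, p) for p in (2, 4, 50, 100)}

print(round(small_clusters), round(large_clusters))
print({f: round(v) for f, v in by_fullness.items()})
print({p: round(v) for p, v in by_order.items()})
```

Under these assumptions the rounded results for items (i) and (v) match
the quoted values (4938 and 494; 2963, 2540 and 2222), and the tree-order
results land close to the rounded figures given in item (iii).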

•  Cost of Query (β): This metric, which indicates the
number of node (or block) accesses required to
execute a query, influences the execution time of the
query. Obviously, for a lower turnaround time, we
prefer as few node accesses as possible. Following
is a summary of our analysis illustrating the
effect of model parameters on this metric. Due to
space limitations, we have included only the results
for query types U and TS in figures 11-14.

(i) Effect of Cluster Size: Since the structure of the
single-level indices and the Global index is
unaffected by clustering, the query cost is also
unchanged in these cases (figure 11). In the
case of Coarse Index 1, given any query range,
the index guarantees a pointer to the node of the
single-level index that begins the range. Hence,
except in some special cases where the cluster size
is not a multiple of the node order (or F · P), this
metric is not sensitive to the cluster size.

The situation is quite different with Coarse
Index 2. Here, since the index guarantees a
pointer only to the beginning of a cluster in the
single-level index, it is sensitive to the range
covered by the cluster. For example, if a cluster
with a size of 1000 starts from key 100 and ends
with key 1099 (here, for simplicity, we assume that
all keys from 100 to 1099 exist), then a query
requiring the range 100-5000 is pointed to the
same block in the single-level index as one
with a query range of 1098-6000. Hence, the
second query has to search several additional
nodes at the leaf level of a single-level index
before it encounters the block with key 1098, from
where the search proper starts. The larger the
cluster size, the larger this overhead. In the
example cases that we studied, for cluster sizes of
up to 100, the performance of both index methods was
comparable. Beyond a cluster size of 300, Coarse
Index 2 becomes expensive.

(ii) Effect of Key Ratio (R_i): While this has some
effect on the cost of single-level index accesses
and coarse index accesses, it has no effect on
the Global index. The query cost using Coarse
Index 1 is affected to a small extent. For example,
when the key ratio is changed from 10:1:1:1
to 1000:1:1:1, the query cost for U with Coarse
Index 1 increased from 90 to 115. Similarly, for
a TS query, it increased from 155 to 170. For the
same changes, the single-level index access cost
increased from 90 to 115. In fact, the increase
for Coarse Index 1 is mainly attributable to the
increase in single-level index access. For the same
changes, the cost with Coarse Index 2 increased
from 95 to 120.

(iii) Effect of Tree Order (P): The tree order has
similar effects on the query cost as it has on the
index size: the cost falls steeply at small values
of P but much more slowly at larger values of P
(see figure 12). For example, for a U query, while
the cost for Coarse Index 2 reduced from 470 to 160
when the order was increased from 2 to 4, it only
reduced from 13 to 8 when the order was doubled
from 50 to 100. The performance shows similar
trends for all index types and all query types.

(iv) Effect of Number of Keys (D): Since the height of
every index grows logarithmically with the number of
keys, the cost also grows only logarithmically with
the number of keys (see figure 13). For example,
when the number of keys is increased from
10^4 to 10^6, the cost of a U query with Coarse
Index 2 increased from 53 to 55. Similar changes
with Coarse Index 1 resulted in an increase of
cost from 57 to 59.

(v) Effect of Fullness factor (F): As in the case
of the index size, a lower fullness factor results in
increased block accesses while descending the
index trees during searching as well as while
performing a horizontal search at the leaf nodes
(see figure 14). Thus, the query cost increases
as the fullness factor decreases. For example,
when the fullness factor is increased from 0.7
to 0.9, the cost of a U query with Coarse Index 2
decreased from 77 to 60. Similar changes with
Coarse Index 1 resulted in a decrease of cost
from 83 to 65.

(vi) Effect of Query Range (Q): Since a horizontal
scan of the nodes holding all the keys in a
given range is required at the leaf level of the
indices, this factor has an effect on the query cost.
Since each node of the tree holds F · P pointers
(10 in our case), the cost grows roughly linearly
with Q, by about one node access for every
F · (P − 1) keys in the range. For example, when the
query range is increased from 100 to 10000, the cost
of a U query with Coarse Index 2 increased from
15 to 455. The same change with Coarse Index
1 resulted in an increase in cost from 10 to 460.
With the Global index, the cost increased from 17
to 1117.

7  Conclusion


In this paper, we have introduced two types of coarse
indexing schemes, Coarse Index 1 and Coarse Index 2,
in the context of MLS databases, primarily with the intent
of improving query response time and reducing the size of
indices. The functionality of these indices is explained in
terms of key clusters, where a cluster corresponds to a
sequence of consecutive (in the global order) keys with the
same security level. In Coarse Index 1, there are pointers
to each of the leaf nodes of the single-level indices
covering the keys in the cluster. On the other hand, Coarse
Index 2 carries one pointer per cluster. The proposed
indexing schemes require a two-phase search technique while
executing an MLS query. In the first phase, a coarse index
is searched to determine a position in the single-level
index (corresponding to one of the single-level databases).
In the second phase, this positional information is used to
carry out a search on the index (or indices) of the
single-level databases.

We have developed detailed algorithms for each of these
schemes. We have then compared the performance of
the proposed indices with single-level indices as well as
a Global multilevel index that contains keys of all
security levels. B+ trees are used to implement the indices,
and two performance metrics, size of index and cost of query
execution, are defined for the comparisons. We developed
an analytical model for the indices in which each index is
modeled by seven parameters: number of keys, number of
security levels, distribution of keys among levels, size of a
query range, order of the B+ tree, the fullness factor of the
B+ tree, and the average size of clusters.

Figure 7. Effect of Cluster Size on Index Size

Figure 8. Effect of Tree Order on Index Size

We have then run several experiments to compare the
performance of the indexing schemes under different sets of
parameter values. We conclude that while both coarse
index methods result in reduced index size, Coarse Index 2
is found to be much more effective in achieving this
objective. The cost of Coarse Index 2, however, is
noticeably higher in query executions with cluster sizes of
200 or higher. The cost of query execution with Coarse Index 1
remained unaffected by the cluster sizes. Except for the
sensitivity to the cluster size, Coarse Index 2 retained its
supremacy over Coarse Index 1 and over the Global index.
The single-level indices seem to have a marginal advantage
over Coarse Index 2 in the cost of query metric.

With  the  encouraging  results  from  this  study,  we  propose 
to  look  at  other  coarse  indexing  schemes.  Especially,  we 
propose  to  look  at  schemes  in  which  a  coarse  index  points 
to  a  security  level  that  a  cluster  belongs  to,  and  at  schemes 
in  which  the  coarse  index  points  to  some  intermediate  node 
of  a  single  index  where  a  cluster  begins. 

Finally, we have assumed that the coarse index is trusted.
It would be interesting to investigate a kernelized
implementation of the coarse index.

Introduction

Ammann, Jajodia, McCollum, and Blaustein define information warfare
as the introduction of incorrect data intended to hinder the operation of
applications that depend on the database [2]. In describing their approach
to surviving these kinds of attacks on databases, they imply that
replication is not useful in dealing with information warfare attacks. In
this paper we present results to the contrary, i.e. replication can be used
(carefully) to both detect and survive information warfare attacks, on a
practical basis.

McDermott and Goldschlag [4, 5] define storage jamming as “malicious
but surreptitious modification of stored data, to reduce its quality. The
person initiating the storage jamming does not receive any direct benefit.
Instead, the goal is more indirect, such as deteriorating the position of a
competitor.” This is essentially the same as information warfare, and we
adopt the latter term. To provide context, Ammann et al. specifically do not
consider Trojan horses within the database system (called internal
jammers [5]), but instead consider a wide range of attacks other than Trojan
horses. Both groups agree that Trojan horses are more effective attackers,
since they can access data which the human attacker cannot. McDermott
et al. show how to detect sophisticated attacks by Trojan horses inside the
database system but do not address recovery or continued operation.
Ammann et al. do not address detection. Instead, they show not only how to
assess damage after an attack, but also how to continue operation with
partially damaged data. This paper seeks to show how replication can be
used not only to detect attacks, but also to assess damage and continue
operation, thus surviving information warfare attacks.


*  This work was supported by ONR. Any opinions, conclusions, or
recommendations expressed in this paper are those of the authors and do not
necessarily reflect the views, policies, or decisions of the Office of Naval
Research or the Department of Defense.

In this paper, we borrow some terms from Ammann et al. and refer to
data damaged by an attack as either red data or off-red data. Red data is
unsafe to use; off-red data has been damaged, but may be used. Green data
is valid and has not been damaged. We also use red and green to describe
values that are to be stored as data, by the attacker and the defender
respectively.

Replication  as  a  Defense:  Detection 

Replication in general is problematic in an information warfare context.
Under many replication approaches, red data can be replicated
automatically and precisely to many locations. However, replication works
as a defense if we use (one-copy serializable) logical replication over
distinct database systems. Many replication algorithms copy data values
from the source data item to its replicas. Logical replication, in
contrast, copies the command that caused the source data item to change.
The command is executed at each replica's site and, because of one-copy
serializability, results in the same new value for the replica. If we
assume a distinct provenance (defined in the next section) for the
database system software at each site, then the Trojan horse will not be
replicated at all sites. An attack must compromise multiple, possibly
heterogeneous, host programs, an unlikely event in practical systems. Even
if the attackers can succeed at every site, the attack still may fail. If
the Trojan horses are not able to deliberately malfunction in a one-copy
serializable fashion, their red values will diverge. This can be ensured
by restricting communication between the sites to just the protocols needed
to carry out the authorized replication. So we can expect a scheme using n
replicas to detect up to n-1 cooperating Trojan horses and possibly detect
an n-Trojan horse attack.

Detection is simple in the replication defense. There is a detection process
at each source or replica site. Following changes to protected data, the
process at the source site computes a checksum over the changed data and
sends it to each replica site, along with the identification of the change.
After the logical update is performed at a replica site, the detection
process at the replica site computes its own checksum and compares it to the
checksum transmitted by the source site detection process. If there is
disagreement, there is a problem. Checksums are not essential to the
approach and are merely used to facilitate efficient comparison. The
granularity of the comparisons or checks is a tradeoff between speed and
storage. Comparisons over individual data items allow quicker response to
attacks but take more storage to perform. We also do not need to check
every change, since the insertion of bogus data at some sites will
ultimately cause the copies to diverge.
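
The granularity tradeoff can be sketched as follows. This is a minimal
illustration, not the authors' implementation: relations are modeled as
plain dictionaries, SHA-256 stands in for whatever checksum a real
detection process would use, and every function name is hypothetical.

```python
import hashlib


def item_checksum(key, value):
    """Checksum over one data item (fine granularity, one digest per item)."""
    return hashlib.sha256(f"{key!r}={value!r}".encode("utf-8")).hexdigest()


def relation_checksum(relation):
    """Checksum over a whole relation (coarse granularity, one digest in total)."""
    digest = hashlib.sha256()
    for key in sorted(relation):
        digest.update(item_checksum(key, relation[key]).encode("ascii"))
    return digest.hexdigest()


def diverging_items(source_sums, replica):
    """Keys whose per-item checksum at the replica disagrees with the source."""
    replica_sums = {key: item_checksum(key, value) for key, value in replica.items()}
    keys = set(source_sums) | set(replica_sums)
    return sorted(k for k in keys if source_sums.get(k) != replica_sums.get(k))


source = {"a": 1, "b": 2, "c": 3}
replica = {"a": 1, "b": 99, "c": 3}  # item "b" has diverged

coarse_alarm = relation_checksum(source) != relation_checksum(replica)
source_sums = {key: item_checksum(key, value) for key, value in source.items()}
damaged = diverging_items(source_sums, replica)
print(coarse_alarm, damaged)
```

The coarse check stores and transmits a single digest but only reports
that something differs. The per-item check needs one digest per data
item, yet it names the diverging items directly, which is the speed
versus storage tradeoff described above.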

If we establish checksums over our entire database, detection can be
effective against both external jammers and internal jammers. (External
jammers attack files outside their host application, e.g. a Trojan horse
hosted by an Oracle database system that attacks Mathematica files is an
external jammer.) Indistinguishability [4] comes for free, without our
being able to define or verify it.

Table 1. The refueling relation

Coke 1

3000

9000

Table 2. The tanker relation


We use the aircraft refueling example of Ammann et al. Suppose we have
a relational database with two relations, refueling and tanker. Both
relations are replicated at three sites: cactus, yucca, and sorrel. The
checksums for refueling and tanker are r_1 and t_1, respectively. At site
yucca, a command is issued to “update refueling set tanker = ‘Coke 2’
where aircraft = ‘Sword 1’”, but there is a Trojan horse in the database
system at yucca that sets tanker = ‘Coke 1’ where aircraft = ‘Sword 2’. The
new (incorrect) checksum for refueling is r_2, which is sent to sites cactus
and sorrel, along with the command “update refueling set tanker = ‘Coke
2’ where aircraft = ‘Sword 1’.” At both cactus and sorrel, the requested
change is made to refueling, but the detection processes compute a
different checksum r_3, because the result of correctly executing the
command is different. Either detection process can now report a problem
because r_2 ≠ r_3. Notice that, in this example, we must also compute
checksums for relation tanker, because the Trojan horse may have modified
the tanker relation while performing “update refueling set tanker = ‘Coke 2’
where aircraft = ‘Sword 1’” correctly. If the Trojan horse had been at site
cactus, the attack would still be detected by the difference in checksums.
(We defer an example of damage assessment and continued operation until
later in the paper.)
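
The refueling example can be simulated in a few lines. The sketch below
is only an illustration under stated assumptions: the initial contents
of the refueling relation are hypothetical (the body of Table 1 is not
available), SHA-256 stands in for the checksum, and the database systems
are reduced to two hypothetical update functions, one honest and one
that behaves like the Trojan horse in the text.

```python
import hashlib
import json


def checksum(relation):
    """Checksum computed by the detection process, outside the database system."""
    payload = json.dumps(sorted(relation.items())).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def honest_update(relation, aircraft, tanker):
    """Executes 'update refueling set tanker = ... where aircraft = ...'."""
    relation[aircraft] = tanker


def trojan_update(relation, aircraft, tanker):
    """Ignores the command and writes a red value, as in the example."""
    relation["Sword 2"] = "Coke 1"


def make_sites(trojan_site):
    """Three replicas with identical (hypothetical) initial contents."""
    initial = {"Sword 1": "Coke 1", "Sword 2": "Coke 2"}
    return {
        name: (dict(initial), trojan_update if name == trojan_site else honest_update)
        for name in ("cactus", "yucca", "sorrel")
    }


def replicate(sites, source, command):
    """Logical replication: ship the command and the source checksum."""
    relation, execute = sites[source]
    execute(relation, *command)
    source_sum = checksum(relation)
    alarms = []
    for name, (replica, replica_execute) in sites.items():
        if name == source:
            continue
        replica_execute(replica, *command)
        if checksum(replica) != source_sum:
            alarms.append(name)  # this site's detection process reports a problem
    return alarms


command = ("Sword 1", "Coke 2")
alarms_trojan_at_source = replicate(make_sites("yucca"), "yucca", command)
alarms_trojan_at_replica = replicate(make_sites("cactus"), "yucca", command)
print(alarms_trojan_at_source, alarms_trojan_at_replica)
```

With the Trojan horse at yucca, both cactus and sorrel raise an alarm.
With the Trojan horse at cactus instead, the mismatch shows up at
cactus, so the attack is still detected, as the text observes.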

It is important that the detection process be separate from the database
system. Otherwise, it might be possible for the Trojan horse to send
checksums for green values to other sites while writing red values at its
own site. Furthermore, the detection process must be trusted, i.e. it must
be high assurance software that is protected from tampering. Finally, we
are assuming that the Trojan horse can be located and removed using
existing system administration tools.

Although this approach is reminiscent of Byzantine generals approaches,
we do not recommend extending it to carry out a similar automatic
agreement protocol. The foremost reason for this is that our approach is
intended to work with shrink-wrapped general purpose software. It is
unlikely that software vendors will modify their products to carry out the
cryptographically protected voting protocols needed to reach Byzantine
agreement. A less important reason is that use of such protocols for every
update would seriously impact the performance of most database systems.
At present it is more expedient to detect the attacks and then remove the
Trojan horse.

Distinct  Provenance 


Software that is created, delivered, installed, and maintained by distinct
sets of people has a distinct provenance. Distinction can be forced to many
levels by a variety of techniques. Since multidatabase techniques allow
replication over heterogeneous systems, the database systems at each site
can be different, even if they are shrink-wrapped general purpose
software packages. Shrink-wrapped general purpose software packages (e.g.
the database system software) can be purchased through blind buys,
which simulates distinct provenance. Applications, site-specific software,
macros, etc. can be developed using clean room techniques. In a clean
room approach, developers provide inspected source code to each site. The
source code is converted to executable form (e.g. compiled and linked, or
converted to p-code) and installed at the operational sites by personnel
distinct to each site. Maintenance and administration can likewise be
separated site-wise by clean room techniques. Our notion of distinct
provenance is not the same as n-version programming. We are not trying
to tolerate inadvertent bugs but to deny an attacker access to multiple
sites. The expectation is that we have now forced would-be attackers to
compromise multiple host programs in very sophisticated ways. A practical
n-Trojan horse attack can only succeed if all n Trojan horses can
maintain one-copy serializability over all changes to their red data and
internal states. Since the successful Trojan horses cannot be replicas of
each other, this is problematic for the attacker. If we assume that, say, a
software development team has m members who understand the software,²
then the n-Trojan horse attack reduces to an mn-person manual attack.


¹ Introduction of heterogeneity may require the use of trusted mapping
functions that are assured to map the logical update commands in a way
that preserves checksums.

In theory, a distinct provenance is possible. In practice, some software
may have commonality. Some distinct software will either have been
developed with the same tools or be based on the same packages. This raises
the question of Trojan-horse-writing Trojan horses [3]. Fortunately, a
would-be attacker introducing an attack via widely-used software faces a
significant problem. The problem is that the Trojan horse’s lifetime is now
likely to be expended against systems other than the target. The Trojan
horse will trigger on sites that are not the intended target. The attacker
must now arrange to turn off the Trojan horse in systems that are not
targets or risk premature discovery of the attack. Attacks via automatic
data input systems face the same problem. More red data must be created,
and not all of it will be put in the target database. This increases the
chance of someone detecting the attack by inspection of the data.

Manual  Attacks 

Logical replication is clearly a problem for Trojan-horse-based attacks
because those attacks function by “disobeying” the commands given to the
software. So we have frustrated the most effective means of attack. But
what about less effective manual attacks? Manual attacks are carried out
by giving malicious commands to the database system. We can deal with
manual attacks in one of two ways: 1) by incorporating an n-person rule
[1], or 2) by incorporating transaction control expressions [7]. An n-person
rule requires n humans outside the system to agree to a change to the
database. Transaction control expressions are a more general form of this
concept. They require multiple users to agree to specific conditions defined
on specific steps of a transaction. In either case, we assume that data
manipulation commands are legitimate unless all n persons can collude.
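
The n-person rule can be pictured as a gate in front of each data
manipulation command. The following Python fragment is only a minimal
sketch of that idea, not the mechanism of [1]: the class name, the method
names, and the sample command text (including its column names) are all
hypothetical and introduced for illustration.

```python
class NPersonRule:
    """Accept a command only after n distinct people have agreed to it."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.approvals = {}  # command text -> set of approver names

    def approve(self, command, person):
        # A set silently ignores a second approval by the same person.
        self.approvals.setdefault(command, set()).add(person)

    def is_authorized(self, command):
        return len(self.approvals.get(command, set())) >= self.n


# Hypothetical command text; the schema is an assumption.
rule = NPersonRule(n=2)
cmd = "UPDATE refueling SET fuel = 100 WHERE tanker = 'Sword 2'"
rule.approve(cmd, "alice")
rule.approve(cmd, "alice")           # same person again: still one approval
first_check = rule.is_authorized(cmd)
rule.approve(cmd, "bob")
second_check = rule.is_authorized(cmd)
```

In this sketch a single user cannot push a change through by approving it
repeatedly; the change becomes legitimate only when n different people
collude, which is the assumption stated above.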

We also note that in newer automated systems, the amount of manual
input to a database is less than in the past. For example, tanker aircraft
may have on-board software that automatically reports the amount of fuel
carried. Refueling assignments to aircraft may be calculated by a decision
aid program. This appears to be just moving the problem around, but in
fact it reduces the opportunities for manual attacks. Information warfare
attacks via automatic data input suffer from the same weakness as Trojan
horses written into mass-produced software (see below).

² The point here is that m is not as large as the entire team, but in a
well-managed, properly assured software development program it is greater
than unity. Under poorly-managed, low-assurance development, we are not
sure any defense is possible.

Application  Attacks,  Interface  Attacks,  etc. 

Careful readers will question whether application programs can be abused
to simulate the advantages of manual attacks while avoiding transaction
control expressions or n-person rules. If an application outside the
database contains the attacking Trojan horse, it can submit commands to
insert bogus values and the database system will replicate the bogus
commands as though they were manual commands. Fortunately, this type
of attack is frustrated by replicating the application software, i.e.
defensive replication is not limited to database systems. The same type of
attack can be made via any software (we hope not via hardware or
firmware!) that lies between a system’s input devices and its output devices.
Careful replication of these components will suffice to detect such attacks
just as the basic database attacks are detected. Our approach does have
trouble with the connections between a system and its I/O peripherals.
When we finally reach the devices that lie at the boundaries of our
system, things become unclear. In a theoretical sense, we can define the
problem away by saying that attacks that modify data as it is being put in
or out are not information warfare attacks. In a practical sense, we would
have to limit our replication to components that handle the most critical
data.

Replication as a Defense: Damage Assessment

Ammann  et  al.  introduced  the  important  concept  of  damage  markings. 
Damage  markings  are  attributes  that  indicate  the  degree  of  damage  that 
has  been  assessed  upon  a  particular  data  item.  We  also  adopt  damage 
markings  and  use  their  scheme. 

Leaving other considerations such as system errors aside, when a check
fails and we detect an attack, we should expect that either the source
database system has been compromised or the replicas that failed the check
have been compromised. All systems participating in the defense should
be alerted. We assume that database administrators and support teams
will eventually locate the Trojan horse and remove it. Data items relating
to the change that failed the check should all be marked red, even though
some will in fact be green. The correct values can be determined by
manual inspection or by simple majority vote over all copies of a data item.
Correct copies of the data are then marked green. Copier transactions can
use the green replicas to repair damaged copies of data items.
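
Majority voting and repair can be sketched in a few lines of Python. This
is an illustration under stated assumptions rather than the damage
assessment algorithm itself: the function names are hypothetical, each
copy is reduced to a single comparable value, and a strict majority is
required before any copy is marked green.

```python
from collections import Counter


def assess_damage(copies):
    """copies maps site name -> value of one data item at that site.

    Returns (correct_value, markings). Without a strict majority the
    correct value is unknown (None) and every copy stays red.
    """
    value, votes = Counter(copies.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None, {site: "red" for site in copies}
    markings = {
        site: "green" if copy == value else "red"
        for site, copy in copies.items()
    }
    return value, markings


def repair(copies, markings, correct_value):
    """Copier transaction: overwrite red copies from a green value."""
    for site, marking in markings.items():
        if marking == "red":
            copies[site] = correct_value
            markings[site] = "green"


# Hypothetical fuel values for one tuple at the three example sites.
copies = {"yucca": 999, "cactus": 120, "sorrel": 120}
value, markings = assess_damage(copies)
marked_before_repair = dict(markings)
repair(copies, markings, value)
```

With three copies, one damaged copy is outvoted by the other two; with
only two disagreeing copies the sketch deliberately refuses to pick a
winner and leaves the decision to manual inspection.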
Suppose we look at damage assessment in our refueling example. Red
markings on the copies of refueling at sites cactus and sorrel can be
changed to green by the damage assessment transaction. Note that
markings for relation tanker need not be changed at any point during this
attack. When the detection processes at sites cactus and sorrel detect the
attack, all tuples of the refueling relation can be temporarily marked red,
on the basis of the text of the command that failed the checksum. Damage
assessment in this case can be accomplished by majority vote, which
allows us to identify the second tuple of yucca’s copy of refueling as
damaged. Tables 3 and 4 indicate the state of the database after damage
assessment, with red data shaded.
Table 3. Marking the damaged refueling relation at site yucca

Table 4. Marking the damaged refueling relation at sites cactus and sorrel


Replication  as  a  Defense:  Continued  Operation 

The use of logical replication may allow us to disconnect compromised
systems until the Trojan horse can be disabled. If an uncompromised site can
act as a source site, it can take over from a compromised source. Replica
sites that do not originate data are also easily disconnected.

A  more  complex  approach  would  logically  “disconnect”  compromised  data 
items  (e.g.  classes  or  relations) 
Defensive Partition

If the compromised site is not the source of the data or there is an
alternate source site, then the replicated database system can be partitioned
into a damaged and an undamaged component. The partition can take
place after damage assessment and could be decided on the basis of an
agreement algorithm, just like the damage assessment. Any compromised
sites are placed in the damaged component. The sites in the undamaged
component can continue to operate normally. Sites in the damaged
component would only be allowed to submit read requests to sites in the
undamaged partition.
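
The partition rule can be illustrated with the following Python sketch.
The function names are hypothetical, and one point is an assumption that
the text does not settle: requests addressed to a damaged site are simply
refused here.

```python
def partition(sites, compromised):
    """Split the sites into (undamaged, damaged) components."""
    damaged = set(sites) & set(compromised)
    undamaged = set(sites) - damaged
    return undamaged, damaged


def request_allowed(sender, receiver, operation, undamaged, damaged):
    """Decide whether a request may cross or stay within the partition."""
    if sender in undamaged and receiver in undamaged:
        return True                      # normal operation continues
    if sender in damaged and receiver in undamaged:
        return operation == "read"       # damaged sites may only read
    return False                         # assumption: nothing goes to damaged sites


undamaged, damaged = partition(["yucca", "cactus", "sorrel"], ["yucca"])
normal_write = request_allowed("cactus", "sorrel", "write", undamaged, damaged)
damaged_read = request_allowed("yucca", "cactus", "read", undamaged, damaged)
damaged_write = request_allowed("yucca", "cactus", "write", undamaged, damaged)
```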

Single-Source  Data 

If the replication is done with only one source site, and that site is
compromised, then we conjecture that we can still use a modified version of
the continued operation protocol of Ammann et al. Their protocol uses
transactions that distinguish between inputs, outputs, pure reads,
updates (read and write), and blind writes, as well as insert and delete. Our
modifications for continued operation under single-source replication are:

1. We do not use the off-green marking. Correct³ values of every
data item will be available for repair of every detected attack.

We decide at database design time whether a data item will be
marked red or off-red during damage assessment.

2. We do not use the Coincidental Damage Deletion rule because it
may allow incorrect deletion of off-red data that will ultimately
be marked green by a damage assessment algorithm⁴. We do use
the other rules listed below. Notice that the remaining rules
have been modified to incorporate replication.

a. Confinement: A normal transaction T that attempts to read,
update, blind write, or delete a data item accesses any
available green copy. If no green copy is available, the normal
transaction attempts to read an off-red copy. If no off-red
copy is available, then T rolls back. A normal transaction
may not create red data.

b. Propagation: If a transaction T reads data marked off-red,
then any output by T is marked off-red. Transaction T may
not update, blindly write, or delete data for which a green
copy is available. Transaction T may not create off-red data.

c. Coincidental Repair of Off-Red Data: If a transaction reads
only green data then any off-red data item it writes blindly is
marked green.

3. We simplify the definition of consistency by leaving out the acceptable
but not necessarily consistent integrity constraints, giving us the
following definition of integrity:

a. For each integrity constraint $i \in I$, where $i$ references
exclusively green data, $i$ holds.

b. For each integrity constraint $i \in I$, where $i$ references data
items $x_1, \ldots, x_n$ that are not green, there exist values for
$x_1, \ldots, x_n$ such that $i$ is satisfiable.

4. All data is initially marked green. Markings are changed by a
damage assessment algorithm, from green to red or off-red, iff the
data is damaged. Damage assessment transactions do not change
any data, but correctly identify valid copies of data items that have
damaged replicas. Markings are changed by a copier transaction
repairing damage, from red or off-red to green.

³ but not necessarily up-to-date

⁴ We assume that the presence of correct copies of the data makes it likely
that this repair will take place in a relatively short period of time.
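
The copy selection order of the Confinement rule and the marking of outputs
under the Propagation rule can be sketched as follows. This Python fragment
is illustrative only: the function names, the `Rollback` exception, and the
representation of a copy as a (site, marking, value) tuple are assumptions,
and the sketch does not enforce the remaining restrictions of the rules
(for example, that an off-red copy may only be read).

```python
class Rollback(Exception):
    """Raised when a normal transaction must roll back."""


def choose_copy(copies):
    """Confinement: prefer any green copy, then any off-red copy.

    copies is a list of (site, marking, value) tuples. Red copies are
    never returned; with no green or off-red copy, T rolls back.
    """
    for wanted in ("green", "off-red"):
        for copy in copies:
            if copy[1] == wanted:
                return copy
    raise Rollback("no green or off-red copy available")


def output_marking(markings_read):
    """Propagation: outputs are off-red once any off-red data was read."""
    return "off-red" if "off-red" in markings_read else "green"


replicas = [("yucca", "red", 1), ("cactus", "off-red", 2), ("sorrel", "green", 3)]
picked = choose_copy(replicas)
picked_without_green = choose_copy(replicas[:2])
try:
    choose_copy(replicas[:1])
    rolled_back = False
except Rollback:
    rolled_back = True
```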

A normal (i.e. not a copier, attacker, or damage assessor) transaction T
preserves consistency if, given a consistent and all green state $S_1$, T
produces a consistent all green state $S_2$.

We now pose a theorem analogous to one of Ammann et al., namely:

Theorem

Suppose a consistency preserving normal transaction T follows the
modified continued operation protocol defined above, and $S_1$ is a consistent
state of the possibly damaged database. Then state $S_2$, the state resulting
from the application of T to $S_1$, is consistent.

Proof: 

1. The Confinement rule prohibits transaction T from accessing any red
data. Transaction T cannot violate I by reading or writing red data.

2. The Propagation rule allows T to cause green data to become off-red.
Consider an integrity constraint i. If some copy of a data item x
referenced in i is green in state $S_1$ and becomes off-red in state $S_2$ as a
result of transaction T's actions, then there must be values for a copy of
data item x that satisfy i, that is, the value of x in $S_1$. (Even though green
replicas of x are often available, all copies of x may be temporarily marked
red or off-red by a damage assessment transaction.) A transaction that
reads off-red data under the Propagation rule cannot modify green data
without causing it to become off-red. For this reason, green data in the
new state must also have been marked green in state $S_1$. Integrity
constraints in I are satisfied in $S_2$.

3. The Coincidental Repair of Off-Red Data rule allows a normal
transaction T to change the marking of an off-red data item x to green when
writing blindly, if T only reads green data. Since transaction T is
consistency preserving, the values it writes when changing x satisfy I in
$S_2$.


To return to our running example: for continued operation we could at
first not use any data from relation refueling. After damage assessment
the relations would be marked as shown by Tables 3 and 4. We could read
and modify the copies of refueling at sites cactus and sorrel even though
the damage was not repaired. Ultimately, a copier transaction could
repair the “Sword 2” tuple of yucca’s copy of refueling, by copying correct
values from either cactus or sorrel.

Stored  Procedures 

Stored procedures are widely used in current databases. Their impact on
storage jamming is problematic. First of all, the stored procedure
mechanism is an ideal tool for building efficient, sophisticated jammers. Stored
procedures also make good hiding places. On the other hand, all but the
most sophisticated jamming attacks against stored procedures are
probably too risky for the attacker. Plausible values for passive data items are
easy to generate, either by arithmetic or by copying components (e.g.
fields). Applying simple arithmetic to the text of a stored procedure does
not necessarily result in a plausible, valid program text. Copying
substrings of a program text into the target procedure may result in a valid
program, but probably not a plausible one.

Predictability is also an issue for the attacker. The modified procedure
may exhibit spectacular behavior that immediately reveals the Trojan
horse. Programs that can automatically generate valid program texts that
also implement specific algorithms are still in the research stage. They
are also relatively large, i.e. on the order of general purpose database
system software, so they would be difficult for the attacker to hide. Inserting
bogus code into multiple stored procedures could result in a combinatorial
explosion of bad data that would also reveal the attack. It is possible that
future research in automatic code generation could make it possible to
build a small malicious program that surreptitiously modifies stored
procedures.

Stored procedures require extra care on the part of the defenders. They
must be replicated, but with distinct provenance. They should not be
automatically copied or translated to the various sites, but should be
reviewed outside the database systems and then installed manually. If
distinct provenance is maintained, replication should be an effective
means of defending against jamming of stored procedures.

Conclusions 

Before presenting our conclusions, we would like to discuss some key
assumptions we are making, so that the application of our results will be
clear:

1. We assume that some malicious software can be introduced into
most systems during their lifetime. We assume that introducing
specific malicious software into multiple sites is problematic and
cannot be done repeatedly or at will.

2. We assume malicious software or users can be removed from a
system soon after they are detected. This is not always so in real
life, but it is possible in systems following best practice.

3. The following software components must be trusted: the
detection process that computes, compares, and transmits checksums,
any mapping functions used to translate logical updates to
site-specific languages, damage assessment voting or agreement
algorithms, and copier transactions used to repair damage. To
warrant this trust they must be correct, unbypassable, and
tamper-proof. We assume sufficient access control, audit, and
cryptographic systems to make this so.

Replication via logical updates is a viable defense that allows detection of,
damage assessment after, and continued operation during information
warfare attacks. With n replicas, logical replication is effective in
detecting automatic (Trojan horse) attacks involving less than mn-person
collusion, where m is the number of members of a software development
or maintenance team. With n-person data entry, logical replication is
effective in detecting manual attacks involving less than n-person collusion.
With transaction control expressions, the likelihood of successful manual
attack is even less.
Detection of an attack by logical replication results in an undamaged copy
of the target data, at either the source or the replica site. A simple
majority of undamaged copies is sufficient to identify the correct values. Even if
there is no majority, possession of the text of the offending command will
allow (admittedly tedious) identification of the correct value.

The continued operation protocol of Ammann, Jajodia, McCollum, and
Blaustein can be used to operate a replicated database system prior to
identification of the correct values. Once the correct values have been
identified, the database can operate from either green copies or our
modification of the original continued operation protocol. The existence of
identifiably correct copies makes it possible to intentionally partition the
damaged database system, thus isolating the offending subsystem.

At present we are prototyping proof-of-concept software for a replicated
architecture defense. Our target system is SQL Server running on
Windows NT. Future work should include more sophisticated continued
operation protocols that account for both communication failures and site
failures. Specific damage assessment algorithms, accompanied by
improved damage marking schemes, may be beneficial to improved recovery
from attacks.
Abstract

Database systems for real-time applications must satisfy
timing constraints associated with transactions. Typically
a timing constraint is expressed in the form of a
deadline and is represented by a priority to be used by
schedulers. In many real-time applications, security is
another important requirement, since the system maintains
sensitive information to be shared by multiple users with
different levels of security clearance. As more advanced
database systems are being used in applications that need
to support timeliness while managing sensitive information,
protocols that satisfy both requirements need to be
developed.

In this paper, we propose a new priority-driven secure
multiversion locking (PSMVL) protocol for real-time secure
database systems. The schedules produced by PSMVL
are proven to be one-copy serializable. We have also
shown that the protocol eliminates covert channels and
ensures that high priority transactions are neither delayed
nor aborted by low priority transactions. The details of
the protocol, including the compatibility matrix and the
version selection algorithm, are presented. Several examples
to illustrate the behavior of the protocol are provided,
along with performance comparisons with other protocols.

1  Introduction 

A multilevel secure database management system
(MLS/DBMS) is a transaction processing system which is
shared by users with more than one clearance level and
which contains data of more than one classification level [8][9].
In order to control all the accesses to the database,
mandatory access control (MAC) mechanisms are adopted
in MLS/DBMS. With MAC mechanisms, the sensitive data
is protected by permitting access by only the users whose
security levels are higher than or equal to the levels of data.
In order for MLS/DBMS to be correct, it has to meet
security requirements in addition to satisfying logical data
consistency. The most important requirements for
multilevel security are the elimination of covert channels
between transactions of different levels and of the starvation of
high-level transactions [8][9].

* This work was supported in part by IITA (Institute of Information

In principle, MLS database systems should be used for
any system that contains sensitive data [13]. In real-time
database management systems (RTDBMS), transactions
have explicit timing constraints such as deadlines [12].
The time criticalness (priority) of a transaction usually
derives from both its timeliness requirement and its
importance. RTDBMS must satisfy timing constraints
associated with transactions and maintain data consistency.
There are increasing needs for supporting applications
which have timing constraints while managing sensitive
data in advanced database systems. To support such
applications, we must integrate real-time transaction
processing techniques into MLS/DBMS, namely MLS/RT
DBMS [14]. Since MLS/RT DBMS needs to support both
MLS and RT requirements, it is easy to see that protocols
for MLS/RT DBMS could be more complicated than
those for MLS/DBMS or RTDBMS. There are several
ongoing research projects on concurrency control protocols
for RTDBMS and MLS/DBMS. However, protocols
for MLS/RT DBMS are rarely presented. Recently, the
SRT-2PL (Secure Real-Time Two Phase Locking) protocol [11]
for MLS/RT DBMS was proposed. The protocol tried to
satisfy the two requirements, but the priority
inversion problem¹ still exists.

In this paper, we propose a priority-driven secure
multiversion locking protocol, called PSMVL, for MLS/RT
DBMSs. The proposed protocol ensures that high-priority
transactions are not blocked by low-priority transactions,
to meet timing constraints, while low-level transactions
are not interfered with by high-level transactions, to avoid
covert channels. Protocols based on multiple versions
require more storage than those based on a single
version. However, the proposed protocol is based on a
multiversion scheme for several reasons. First, since disk prices
have come down dramatically, the disk space needed to
store multiple versions is cheaper. Second, a concurrency
control protocol which maintains a single version
of each data item, such as 2PL-HP [2], OPT-wait [7],
or OPT-sacrifice [7], cannot avoid the starvation of high
level transactions, because low level transactions should
neither be delayed nor be aborted to prevent covert
channels. Nor can such a protocol eliminate the starvation of
low priority transactions, since low priority transactions
can be delayed or aborted by high priority transactions.
Third, the protocols which maintain two versions of each
data item can partly resolve the above starvation
problems. However, due to the limited number of versions,
when a high priority transaction at a high level conflicts
with a low priority transaction at a low level on the same
data item, the protocols sacrifice one of the requirements.
Therefore, a multiversion scheme is considered appropriate
to satisfy all the requirements for MLS/RT DBMSs. In
addition, since our protocol is based on multiple versions, late
transactions can read an old version of each data item,
which increases the degree of concurrency. We have shown
that the histories² produced by the protocol are one-copy
serializable³.

¹ Priority inversion occurs when a high-priority transaction is
delayed by a low-priority transaction. It is not desirable in real-time
database systems.

The rest of the paper is organized as follows. In
Section 2, we present the security model of this paper and then
introduce the features of transactions in RTDBMS. In
Section 3, we classify the transactions according to their
characteristics to discuss the conflicting natures of the
requirements, and present the PSMVL protocol and the version
selection algorithm. In Section 4, we present an example
to illustrate the behavior of the protocol. In Section 5, we
prove the correctness of the protocol and show that it
ensures serializability, security requirements, and no priority
inversion. After the performance results of the protocol are
presented in Section 6, we conclude the paper in Section 7.


² A history indicates the order in which the operations of transactions
are executed relative to others.

³ When we prove the correctness of a multiversion concurrency control
protocol, we must show that the multiversion (MV) histories generated by
the protocol are one-copy serializable. An MV history H is one-copy
serializable if its committed projection, C(H), is equivalent to a 1-serial
MV history, where C(H) is the history obtained from H by deleting all
operations that do not belong to committed transactions in H [6]. A serial
MV history H is 1-serial if for all i, j, and some data item x, if $T_i$ reads
x from $T_j$, then i = j, or $T_j$ is the last transaction preceding $T_i$ that
writes into any version of x.


2  Background 

Let the security level of a transaction T be denoted by
L(T) and the security level of a data item x be denoted by
L(x). When transactions access data items, the following
security policies are adopted as ours.

(1) Simple security property for read operations [4]: A
transaction T is allowed to read a data item x if and
only if $L(T) \ge L(x)$.

(2) Restricted star property (*-property) for write
operations: A transaction T is allowed to write into a data
item x if and only if $L(T) = L(x)$.

The above two restrictions are intended to protect sensitive
data by permitting access only to the users whose
security levels are higher than or equal to the levels of data.
In other words, read/write operations at the same level and
read operations at the lower level (read-down) are allowed.

There are three MLS properties: value, delay, and
recovery secure properties [8]. To present the definitions, the
purge function is introduced. Given a schedule S and a
security level SL, purge(S, SL) is the function that removes
all operations from S whose level is greater than SL. For
an input schedule S, the output schedule S' is value secure
if purge(S', SL) is view equivalent to the output
schedule produced for purge(S, SL). For an input schedule S
and an output schedule S', a schedule is delay secure if for
each level SL in S, any operation $o_i$ in purge(S, SL) is
delayed in the output schedule produced for purge(S, SL) if
and only if it is delayed in purge(S', SL). A schedule is
recovery secure if a set of transactions, T, is in a deadlock
state when every transaction in T is waiting for an event
that can only be caused by other transactions in T.
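
The purge function itself is a simple filter. The sketch below assumes a
schedule is represented as a list of (operation, level) pairs with integer
levels; that representation and the sample schedule are illustrative
assumptions.

```python
def purge(schedule, sl):
    """Remove every operation whose level is greater than sl.

    schedule is a list of (operation, level) pairs; the relative order
    of the remaining operations is preserved.
    """
    return [(op, level) for op, level in schedule if level <= sl]


# Hypothetical schedule: T1 runs at level 0, T2 at level 1.
schedule = [("r1[x]", 0), ("w2[y]", 1), ("r2[x]", 1), ("w1[x]", 0)]
low_view = purge(schedule, 0)
full_view = purge(schedule, 1)
```

Here `low_view` keeps only the two level-0 operations, which is the part
of the schedule a level-0 observer is entitled to see.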

A key feature of RTDBMS is that each transaction has
timing constraints [1]. The concept of value function is
adopted as the way of representing the timing constraints
of real-time transactions. For each transaction, the output
of the corresponding value function expresses the amount
of profit that can be obtained by the completion of the
transaction before its deadline. Since it is more
advantageous to the system for transactions with the largest
values to be completed before their deadlines, high priority
is given to the transactions that have a large output value.
At run time, high-priority transactions should precede
low-priority transactions.

In  this  paper,  we  adopt  the  priority  assignment  policy 
proposed  in  [15].  In  that  policy,  each  transaction  has  an 
initial  priority  and  a  start-timestamp.  The  initial  priority 
of  a  transaction  indicates  the  criticality  of  the  transaction. 
The  practical  priority  consists  of  the  initial  priority  and  the 
start-timestamp. 
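
How the two components are combined is defined in [15] and is not
reproduced here. Purely as an illustration, the following Python sketch
assumes that a larger initial priority wins and that ties are broken in
favor of the earlier start-timestamp; both the combination rule and all
names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    name: str
    initial_priority: int   # criticality; larger means more critical
    start_timestamp: int    # smaller means the transaction started earlier


def practical_priority(t):
    # Assumed rule: compare initial priority first, then prefer the
    # earlier start (hence the negated timestamp).
    return (t.initial_priority, -t.start_timestamp)


def has_higher_priority(a, b):
    return practical_priority(a) > practical_priority(b)


t1 = Transaction("T1", initial_priority=5, start_timestamp=10)
t2 = Transaction("T2", initial_priority=5, start_timestamp=20)
t3 = Transaction("T3", initial_priority=7, start_timestamp=30)
earlier_wins_tie = has_higher_priority(t1, t2)
critical_wins = has_higher_priority(t3, t1)
```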
3  The  PSMVL  protocol

In this section, we examine the conflicts between real-time
and security requirements, and then present the
protocols and related rules.

3.1  Motivations 

Let $T_i$ and $T_j$ be transactions in a conflicting mode and
let $P(T_i)$ and $L(T_i)$ be the priority of $T_i$ and the security
level of $T_i$, respectively. Then, there are three possible
cases for the priorities of these transactions: (1) $P(T_i) = P(T_j)$,
(2) $P(T_i) > P(T_j)$, (3) $P(T_i) < P(T_j)$. Since (2)
and (3) are symmetric, without loss of generality we can
consider just one, say (2). Therefore, we can assume that
$P(T_i)$ is higher than or equal to $P(T_j)$. In addition, there are
three cases for the security levels of these transactions: (1)
$L(T_i) = L(T_j)$, (2) $L(T_i) > L(T_j)$, (3) $L(T_i) < L(T_j)$.

Let $P_{High}$, $P_{Low}$, and $P_{eq}$ be the priorities, with
$P_{High} > P_{Low}$. Let $L_{High}$, $L_{Low}$, and $L_{eq}$ be the
security levels, with $L_{High} > L_{Low}$. In Table 1, $L_{eq}$ is used
in the case that two transactions have the same security
level. Table 1 shows all possible combinations of priority
and security level pairs between $T_i$ and $T_j$.
not be blocked by $T_j$. On the other hand, $T_j$ cannot be
blocked by $T_i$ in order to avoid covert channels. We call
this kind of conflict HH/LL-conflict. It is a conflict between
a high security level transaction with high priority and a
low security level transaction with low priority. Ideally, $T_i$
should precede $T_j$ because of their priorities. We resolve
the HH/LL-conflicts by using the proposed PSMVL protocol.

The  compatibility  matrix 

Like the multiversion two phase locking (MV2PL)
protocols], PSMVL has three types of locks: read, write,
and certify locks. The locks are governed by the
compatibility matrix in Figure 1. Since no conflict occurs between
read/write or write/write operations, the certify locks are
needed in order to get the correct synchronization among
transactions. The scheduler adopting the PSMVL protocol
acquires read and write locks before processing read and
write operations, respectively. When a transaction has
terminated and is about to commit, the scheduler converts all
of the transaction’s write locks into certify locks.

We only consider the cases where lock requesters
and lock holders have different security levels. As already
mentioned, PHigh and PLow are priorities such that
PHigh > PLow, while LHigh and LLow are security levels
such that LHigh > LLow. Let TH(PH, LH) be a lock
holder with priority PH and security level LH. Similarly,
let TR(PR, LR) be a lock requester with priority PR and
security level LR. Conflicts between any operations of
lower level transactions and write or certify operations of
higher level transactions cannot occur because of our security
policy. There are four cases based on both the priorities
and the security levels.


In the first case, the priorities and the security levels of
the two transactions are the same. Therefore, the only concern
is ensuring serializability. Security requirements and
timing constraints can be ignored in this case. In the second
and the third cases, the priorities of the two transactions are
the same. Hence, timing requirements can be ignored and
the low level transaction should not be delayed by the high
level transaction. In the fourth case, the security levels of
the two transactions are the same. The transactions must be
scheduled so that they meet the timing constraints as well
as logical consistency. In this case, any protocol based
on the multiversion scheme for RTDBMS can be used. In
the fifth case, since P(Ti) > P(Tj), Ti must be followed by
Tj in order to prevent priority inversion. In addition, Ti
can neither be delayed nor aborted by Tj, to avoid covert
channels. Both requirements can be satisfied by having Ti
precede Tj.

The problem occurs in the sixth case where P(Ti) >
P(Tj) and L(Ti) > L(Tj). Since P(Ti) > P(Tj), Ti should
Figure 1: The compatibility matrix for PSMVL

In Figure 1 (a), P(TR) > P(TH) and L(TR) < L(TH).
For priority and security reasons, TR cannot be blocked.
Therefore, TH should be aborted. Aborting TH also
lets TH read more recent data when it restarts, without
violating any requirements.

In Figure 1 (b), P(TR) < P(TH) and L(TR) < L(TH).
Under the BLP model, this situation can occur only when
the operation of TR is a write while the operation of TH is
a read-down. In this case, TH cannot be blocked, in order
to avoid priority inversion, and TR cannot be delayed for
security reasons. TH is inserted into HH-list(x), where x
is the data item TR writes. HH-list(x) is used for keeping
the orderings of priorities and is defined in detail in the
next section. In Figure 1 (c), P(TR) > P(TH) and L(TR) >
L(TH). In this case, TR cannot be blocked because of its
priority and TH should not be delayed, in order to avoid
covert channels. Since neither TR nor TH can be blocked, TR
can share a lock and TR is inserted into HH-list(x), where x
is the data item TH writes. In Figure 1 (d), P(TR) < P(TH)
and L(TR) > L(TH). TR cannot block TH because of its
priority and security level. Therefore, TR is blocked until
TH commits.
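
The four cases of Figure 1 can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of the protocol definition: the names Txn, Action, and resolve are hypothetical, larger integers are assumed to denote higher priority and higher security level, and the requester and holder roles follow the description of cases (a) to (d) above.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ABORT_HOLDER = "abort the lock holder"                        # case (a)
    GRANT_LIST_HOLDER = "grant; holder goes into HH-list"         # case (b)
    SHARE_LIST_REQUESTER = "share; requester goes into HH-list"   # case (c)
    BLOCK_REQUESTER = "block the requester"                       # case (d)


@dataclass(frozen=True)
class Txn:
    priority: int  # assumption: larger value = higher priority
    level: int     # assumption: larger value = higher security level


def resolve(requester: Txn, holder: Txn) -> Action:
    """Map a requester/holder pair to one of the cases (a)-(d)."""
    if requester.level == holder.level or requester.priority == holder.priority:
        # Equal priorities or equal levels are handled by other rules.
        raise ValueError("cases (a)-(d) need differing priorities and levels")
    higher_priority = requester.priority > holder.priority
    higher_level = requester.level > holder.level
    if higher_priority and not higher_level:
        return Action.ABORT_HOLDER
    if not higher_priority and not higher_level:
        return Action.GRANT_LIST_HOLDER
    if higher_priority and higher_level:
        return Action.SHARE_LIST_REQUESTER
    return Action.BLOCK_REQUESTER
```

For instance, a low-level writer with low priority that meets a high-level, high-priority reader falls into case (b): the write proceeds and the reader is remembered in the HH-list.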

3.3 Version management

Concurrency control protocols based on multiversion
locking use 2PL for write/write synchronization and version
selection for read/write synchronization [6]. When
a transaction is about to choose a version of a data item,
the most recently committed version is commonly used.
However, HH/LL-conflicts exist in an environment
where RT and MLS requirements must be considered
together. To resolve them, the certify operation of a low
level transaction with low priority is permitted in case (b)
of Figure 1, and the read operation of a high level
transaction with high priority is permitted in case (c) of
Figure 1. Therefore, additional rules for version selection
are required.

Figure 2 shows the data structures for versions used in
this protocol. In an MLS/RT DBMS, each data item has its
own security level and each transaction has a priority and
a security level. Data_itemT, VersionT, and Read_downT
are data types. Data_itemT is the data type for each data
item, while VersionT is used for storing the versions of
each data item. Read_downT is a data type for the
data items which are read by high level transactions.

Each data item contains two fields: level and version.
The level represents the security level of a data item and it
must be trusted.

The  version  is  the  field  for  a  version,  and  contains  a 
timestamp,  a  value,  a  hhllptr,  and  a  vlink.  The  vlink  is 
the  pointer  to  the  next  version.  The  hhllptr  is  a  pointer 


Data_itemT:

  Field name   Type                 Description
  level        level                The security level of a data item
  version      VersionT             A version of a data item

VersionT:

  Field name   Type                 Description
  timestamp    time                 The creation time of a version
  value        value                The value of a version
  hhllptr      Read_downT pointer   The pointer that points to a node which
                                    contains the number of active higher
                                    security level transactions that read
                                    down the version
  vlink        VersionT pointer     The pointer that points to the next
                                    version of a data item

Read_downT:

  Field name   Type                 Description
  level        level                A security level of one or more active
                                    transactions that read down the version
  count        integer              The number of active transactions that
                                    read down the version for a given
                                    security level
  clink        Read_downT pointer   The pointer that points to the next node
                                    of type Read_downT (self-referential)

Figure 2: Data structures


that resolves HH/LL-conflicts and maintains the number
of higher security level transactions that read down the
version. Let Tj and Tk be two transactions with the
HH/LL-conflict relationship on some data item x. Let P(Tj) >
P(Tk) and L(Tj) > L(Tk) = L(x). In our security policy,
this situation can occur when Tj executes a read-down
operation while Tk executes a write operation. The hhllptr
is the field of xi that Tj reads, i.e., xi is the old committed
version of x available to Tj. Since the high priority
transaction Tj can read old versions of x, Tj need not be
blocked until the lower level transaction Tk writes. The
hhllptr must be trusted. The hhllptr consists of three fields:
level, count, and clink. The level and the count represent
the security level of transactions that read down the version
in HH/LL-conflicting mode and the number of such
transactions, respectively. The clink is the pointer that points
to the next node for lower level transactions in
HH/LL-conflicting mode.

When an HH/LL-conflict occurs, the following procedure,
called HH/LL-procedure, is needed for maintaining
the version information. HH/LL-procedure is presented in
Algorithm 1.

For each data item x, we maintain a list of transactions,
denoted by HH-list(x), in order to preserve the orderings
of priorities. HH-list(x) is a list of higher priority and
higher security level transactions that are active when
another transaction executes a write operation. HH-list(x)
can be obtained from a lock table¹. If a transaction Tj writes

¹ The lock table contains the information about which transactions have
x at tnow, while another transaction Ti with higher priority
and higher level than Tj is reading x, then Ti is inserted
in HH-list(x). This insertion means that an HH/LL-conflict
can occur in the future because P(Ti) > P(Tj). When a
transaction reads a data item, HH-list is used to select the
appropriate version according to its priority in the version
selection algorithm.
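
As a minimal sketch of how HH-list(x) could be derived from lock-table information, the Python fragment below filters the current readers of x against a writer. The names Txn and hh_list are hypothetical, and larger integers are assumed to mean higher priority and higher security level.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Txn:
    name: str
    priority: int  # assumption: larger value = higher priority
    level: int     # assumption: larger value = higher security level


def hh_list(readers: List[Txn], writer: Txn) -> List[Txn]:
    """Readers of x that outrank the writer in both priority and level."""
    return [t for t in readers
            if t.priority > writer.priority and t.level > writer.level]
```

Only readers that are strictly higher in both dimensions are kept, which matches the condition under which an HH/LL-conflict can arise later.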

/* When a transaction Tj selects the version xi of x,
   the following steps are required. Discussion of the
   version selection algorithm will be provided in
   a later section. In the algorithm, 'u->v' denotes
   that v is a member of u. */

<Type declaration>
xi        : Data_itemT
new, node : Read_downT
FIND      : boolean type whose value is either TRUE
            or FALSE.

if (xi->hhllptr is null) then
    create a node new;
    new->count = 1;
    new->level = L(Tj);
    xi->hhllptr = new;
else
    node = xi->hhllptr;
    FIND = FALSE;
    while (node is not null) do
        if (node->level is L(Tj)) then
            node->count = node->count + 1;
            FIND = TRUE;
            break the loop;
        end if
        node = node->clink;   /* advance to the next node */
    end while

    if (FIND is FALSE) then
        create a node new;
        new->count = 1;
        new->level = L(Tj);
        append new to xi->hhllptr;
    end if
end if

Algorithm 1: HH/LL-procedure
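
The bookkeeping of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the reference implementation: the class names ReadDownNode and Version and the function hhll_procedure are hypothetical stand-ins for Read_downT, VersionT, and the HH/LL-procedure, and security levels are modeled as integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadDownNode:
    level: int                               # security level of the readers
    count: int = 1                           # active read-down transactions at that level
    clink: Optional["ReadDownNode"] = None   # next node in the list


@dataclass
class Version:
    timestamp: int
    value: object
    hhllptr: Optional[ReadDownNode] = None   # head of the read-down list
    vlink: Optional["Version"] = None        # next version of the data item


def hhll_procedure(version: Version, reader_level: int) -> None:
    """Record that a transaction at reader_level reads down this version."""
    node, last = version.hhllptr, None
    while node is not None:
        if node.level == reader_level:
            node.count += 1                  # a node for this level already exists
            return
        last, node = node, node.clink        # advance along the clink chain
    new = ReadDownNode(level=reader_level)
    if last is None:
        version.hhllptr = new                # list was empty
    else:
        last.clink = new                     # append at the tail
```

Two readers at the same level share one node whose count is 2, while a reader at a different level gets its own node appended to the list.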

Let Ti(Pi, Li) be a transaction with priority Pi and
security level Li. Assume that there are two transactions
T1(P1, L1) and T2(P2, L2) where P1 > P2 and L1 > L2.
Let the levels of two data items x and y be L2. Suppose
that T1 and T2 execute as shown in Figure 3 (a). For each
data item x, HH-list(x) is initially empty. At time 2,
HH-list(x) = {T1}. At time 4, an HH/LL-conflict occurs and the
versions for x are as shown in Figure 3 (b). At time 5, T1
commits and the count for x0 becomes 0. Then the version
x0 can be deleted.

Three different operations can be performed on a
version: creation, deletion, and selection. Since there is no
write/write conflict in the PSMVL protocol, a new version
can be created without delay. Because of HH/LL-conflicts,
two or more old versions must be stored. A version that
is older than the latest committed version can be deleted
when no high level transaction is reading that
version. Let Ti(Pi, Li) be a transaction with priority Pi

locks  on  some  data  items. 


T1(P1, L1): r1[x] c1        T2(P2, L2): w2[x] c2

(a) An example execution

(b) The versions of x

Figure 3: The versions of data item x


and security level Li. When Ti(Pi, Li) is about to read a
data item x, the version selection algorithm (Algorithm 2)
selects an appropriate version of x for Ti.

/* When a transaction Ti reads a data item x,
   this procedure specifies the steps for
   selecting the right version of x.
   In the algorithm, 'u->v' denotes that v is
   a member of u. */

<Type declaration>
x        : Data_itemT
hhllnode : Read_downT
FIND     : boolean type whose value is
           either TRUE or FALSE.

if Ti has a write lock on x, then
    Ti must read the version Ti writes;
else
    let hhllnode (a variable) be the node linked
    to the hhllptr of the first version of x.
    FIND = FALSE;
    for (all versions of x) do
        if (hhllnode is null) then
            let hhllnode be the node linked to
            the hhllptr of the next version of x.
        else
            /* hhllnode is not null, i.e., some higher
               level transactions already read down
               the version */
            for (all nodes linked to the hhllnode) do
                find a node such that the level of the
                node is less than or equal to L(Ti).
                if (there exists such a node) then
                    FIND = TRUE;
                    mark the version as xj;
                    break the loop;
                end if
            end do
        end if
    end do

    if (FIND is TRUE) then
        return xj;
    else
        find the version such that it is the latest
        committed version of x before Ti issues the
        read operation;
        return the version;
    end if
end if

Algorithm 2: Version selection algorithm
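
A simplified Python sketch of the selection logic in Algorithm 2 is given below. It rests on several assumptions that are not in the algorithm itself: the read-down list of each version is flattened into a plain list of integer levels, versions carry a commit timestamp, at least one version committed before the read exists, and the names Version and select_version are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Version:
    commit_ts: int                       # commit timestamp of the writer
    value: object
    # Levels recorded in the version's read-down list (flattened hhllptr chain).
    read_down_levels: List[int] = field(default_factory=list)


def select_version(versions: List[Version], reader_level: int, now: int,
                   own_write: Optional[Version] = None) -> Version:
    """Choose the version a reader at reader_level should see at time now."""
    if own_write is not None:
        return own_write                 # a writer reads its own version
    for v in versions:
        # A node whose level is <= L(Ti) pins this version for the reader.
        if any(level <= reader_level for level in v.read_down_levels):
            return v
    # Otherwise fall back to the latest version committed before the read.
    committed = [v for v in versions if v.commit_ts < now]
    return max(committed, key=lambda v: v.commit_ts)
```

With an old version pinned by a level-2 reader and a newer committed version, a level-3 reader keeps seeing the old version, whereas a level-1 reader gets the newest committed one.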
3.4 The protocol

Before we present our protocol, we define the
timestamp assignment rules that the protocol uses
to select appropriate versions.

/* Let x be a data item and T, T' be transactions.
   In addition, let R, W, and C be read, write, and
   certification operations, respectively. */

case 1: T requests a read operation on x
    if (x is locked with W or C) then
        if (any lock-holder has a higher or equal
            priority) then
            if (any lock-holder has a lower or equal
                security level) then
                block lock-requester;
            end if
        else
            /* lock holder has a lower priority */
            Let T' be the transaction that has
            lower priority than that of T;
            if (L(T') is lower than L(T)) HH/LL-procedure();
            else if (L(T') is the same as L(T)) then
                for (all T' which has C on x) do
                    convert the lock from C to W;
                end do
            end if
        end if
    end if
    grant a read lock to T;

case 2: T requests a write operation on x
    Let T' be any transaction that already has a lock
    on x (lock holder).
    if (P(T') is higher than or equal to P(T)) then
        if (L(T') > L(T)) HH/LL-procedure();
    else /* P(T') < P(T) */
        if (L(T) = L(T')) then
            T' is blocked by T;
            wait;
        end if
    end if
    grant a write lock to T;

case 3: T requests a certify operation on x
    Let T' be any transaction that already has a lock
    on x (lock holder);
    if (P(T') > P(T)) then
        if (L(T') = L(T)) then
            the request is rejected and blocked;
        else if (L(T') > L(T)) HH/LL-procedure();
        end if
    else if (P(T') = P(T)) then
        if (L(T') = L(T)) then
            if (T' holds a read or certify lock) the
                request is rejected;
        else if (L(T') > L(T)) HH/LL-procedure();
        end if
    else /* P(T') < P(T) */
        if (L(T') = L(T)) then
            if (T' holds a certify lock) then
                for (all T' which has C on x) do
                    convert the lock from C to W;
                end do
            end if
        else if (L(T') < L(T)) HH/LL-procedure();
        end if
    end if
    grant C on x;

Algorithm 3: The protocol

First, for each data item x, a timestamp TS(x) is given
to x when x is created. Second, for a read-only transaction
Ti, the starting timestamp, S_TS(Ti), is assigned. For an
update transaction Tj, both the starting timestamp, S_TS(Tj),
and the committing timestamp, C_TS(Tj), are assigned. We
assume that the system guarantees the uniqueness of each
timestamp.
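
One simple way to satisfy the uniqueness assumption is a single monotonically increasing counter. The Python sketch below is only an illustration: the names TimestampSource, Txn, start, and commit are hypothetical, and a real system might use a different clock.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


class TimestampSource:
    """Hands out strictly increasing, hence unique, timestamps."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def next(self) -> int:
        return next(self._counter)


@dataclass
class Txn:
    name: str
    read_only: bool
    s_ts: Optional[int] = None   # starting timestamp S_TS
    c_ts: Optional[int] = None   # committing timestamp C_TS (update transactions only)


def start(txn: Txn, clock: TimestampSource) -> None:
    """Every transaction receives a starting timestamp."""
    txn.s_ts = clock.next()


def commit(txn: Txn, clock: TimestampSource) -> None:
    """Only update transactions receive a committing timestamp."""
    if not txn.read_only:
        txn.c_ts = clock.next()
```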

The  compatibility  matrix  shown  in  Figure  1  is  the  basis 
for  the  PSMVL  protocol.  When  transactions  have  the  same 
priority  and  the  same  security  level,  it  behaves  similarly  to 
the  MV2PL  protocol. 

3.5  The  properties  of  PSMVL  protocol 

Let H be a history over a set of transactions
{T0, T1, ..., Tn} produced by PSMVL. Then, H must
satisfy the following properties. In order to list the
properties of histories produced by executions of PSMVL, we
need to include the operation fi denoting the certification
of Ti.

PSMVL1: For every Ti, there is a unique starting
timestamp S_TS(Ti); that is, S_TS(Ti) = S_TS(Tj) iff i = j.

PSMVL2: For every Ti, fi follows all of Ti's reads and
writes and precedes Ti's commitment.

PSMVL3: For every rk[xj] in H, if j ≠ k, then cj <
rk[xj]. That is, every read operation reads a committed
version.

PSMVL4: Let tnow be the time to execute rk[xj]. Then
xj is either (a) the most recently committed version
before tnow or (b) the version that an active update
transaction Ti whose security level is less than or
equal to L(Tk) reads down. In case (a), C_TS(Ti)
< C_TS(Tj) or S_TS(Tk) < C_TS(Ti). In case (b),
C_TS(Ti) < C_TS(Tj) < S_TS(Tk) < C_TS(Tk) or
C_TS(Tj) < S_TS(Tk) < C_TS(Ti). That is, for every
rk[xj] and wi[xi] in H, (a) C_TS(Ti) < C_TS(Tj) or
(b) S_TS(Tk) < C_TS(Ti).

PSMVL5: For every rk[xj] and wi[xi] (i, j, and k are
distinct), either fi < rk[xj] or rk[xj] < fi.

PSMVL6: For every rk[xj] and wi[xi] in H, i ≠ j and
i ≠ k, if rk[xj] < fi and the priority of Tk is greater
than that of Ti, then C_TS(Tk) < TS(wi[xi]).

PSMVL7: For every update transaction Ti, there is
a unique commit timestamp C_TS(Ti). That is,
C_TS(Ti) = C_TS(Tj) iff i = j.

4  Examples 

In  this  section,  we  illustrate  the  operations  of  the  pro¬ 
tocol  by  showing  two  example  histories  produced  by 
PSMVL  protocol.  We  show  how  each  transaction  reads 
the  right  version  to  meet  various  requirements. 
T1(P1, L1): r1[x] c1
T2(P2, L2): r2[x] w2[y] c2
T3(P3, L3): r3[x] w3[x] c3

[Table body damaged in extraction: rows T1(P1, L1), T2(P2, L2), and T3(P3, L3) over time slots 1 to 6. Surviving entries: r2[x0] and "rejected" for T2; c3 for T3.]

Table 2: The first example


T1(P1, L1): r1[x] r1[a] c1        T2(P2, L2): r2[x] r2[z] w2[a] c2
T3(P3, L3): r3[z] r3[x] w3[z] c3  T4(P4, L4): w4[x] c4

[Table body damaged in extraction: rows T1 to T4 over time slots 1 to 13. Surviving entries: r1[x0] and r1[a0] for T1; r2[x0], r2[z0], and w2[a2] for T2; r3[z0] and r3[x0] for T3; w4[x4] for T4.]

Table 3: The second example


Example 1 Assume that P1 < P2 < P3, L1 > L2 > L3,
L(x) = L3, and L(y) = L2. The operations of each
transaction are specified as shown in Table 2.

In this example, S_TS(T1) = 5, C_TS(T1) = 6, S_TS(T2)
= 3, S_TS(T3) = 1, and C_TS(T3) = 4. Since P(T3) > P(T2)
and L(T3) < L(T2), T2 is rejected by T3 at time 4 (by the
rule in Figure 1 (a)).

Example 2 Assume that P1 > P2 > P3 > P4, L1 > L2
> L3 > L4, L(z) = L3, L(x) = L4, and L(a) = L2. The
operations of each transaction are specified as shown in
Table 3.


[Figure 4 panels; each version record shows level, timestamp, value, hhllptr, and vlink, and each read-down node shows level, count, and clink.]

(a) The versions of x (at time 4)

(b) The versions of z (at time 9)

(c) The versions of a (at time 12)


At time 3, HH-list(x) = {T2, T3}. At time 4, since an
HH/LL-conflict between T2 and T4 occurs, as shown in
Figure 4 (a), x0->hhllptr points to a new node which
contains L2 and a count of 1. At time 5, T3 reads x0 because
T3 is in HH-list(x). At time 6, HH-list(z) = {T2}. At
time 7, T1 reads x0 because x0->hhllptr is not null and
it contains the lower transaction level L3. At time 9, since
HH-list(z) is not null, T2 reads z0 and the versions of z are
as shown in Figure 4 (b). At time 10, HH-list(a) = {T1}
and at time 11, T1 reads a0 because T1 is in HH-list(a).

5  Correctness  proofs 

In this section, we prove that the PSMVL protocol
guarantees one-copy serializability and no priority inversion. In
addition, we show that it satisfies multilevel security
requirements.


Figure  4:  A  sequence  of  operations  for  the  second  example 

5.1  Serializability 

Theorem 1 A multiversion schedule, H, is one-copy
serializable (1SR) if and only if MVSG(H, ≪) is acyclic [6].

Theorem 2 Every history produced by PSMVL is 1SR.

Proof: Let T1, T2, ..., Tn be a set of transactions, and
H be a history produced by the PSMVL protocol over T1, T2,
..., Tn. We will prove that MVSG(H, ≪) is acyclic by
showing that every edge Ti -> Tj in MVSG(H, ≪) is in
timestamp order.

We define a version order ≪ by xi ≪ xj only if
C_TS(Ti) < C_TS(Tj). Suppose Ti -> Tj is an edge of
SG(H)². This edge corresponds to a reads-from
relationship (i.e., for some x, Tj reads x from Ti). Then, by
PSMVL3, C_TS(Ti) < S_TS(Tj) < C_TS(Tj). Let rk[xj]
and wi[xi] be in H where i, j, k are distinct, and
consider the version order edge that they generate. There are
two cases: (1) xi ≪ xj, which implies Ti -> Tj is in
MVSG(H, ≪); and (2) xj ≪ xi, which implies Tk ->
Ti is in MVSG(H, ≪).

Case (1): by definition of ≪, C_TS(Ti) < C_TS(Tj).

Case (2): by PSMVL4, either C_TS(Ti) < C_TS(Tj) or
S_TS(Tk) < C_TS(Ti). The first case is impossible,
because xj ≪ xi implies C_TS(Tj) < C_TS(Ti).
Hence, it must be true that S_TS(Tk) < C_TS(Ti). We
show that S_TS(Tk) < C_TS(Ti) ensures Tk -> Ti in
MVSG(H, ≪). There are two possible cases: (1)
P(Tk) > P(Ti) and (2) P(Tk) < P(Ti).

In case (1), Tk starts before Ti commits, P(Tk) > P(Ti),
and Tk reads the older version xj rather than xi. Therefore,
it ensures that Tk -> Ti. In case (2), Tk has a lower priority
and a higher security level than Ti. Thus, if Ti executes
wi[xi] before Tk commits, then Tk is aborted following the
compatibility matrix in Figure 1 (a). However, there
exists rk[xj] in H. Thus, C_TS(Tk) < S_TS(Ti) < C_TS(Ti)
and it ensures that Tk -> Ti. Since all edges in MVSG(H,
≪) are in timestamp order, MVSG(H, ≪) is acyclic. By
Theorem 1, H is 1SR. □

5.2 Timing constraints

Theorem 3 A higher priority transaction is neither
delayed nor aborted by low-priority transactions due to data
contention on low-level data.

Proof: Let Ti and Tj be two transactions such that P(Ti)
> P(Tj), where P(Ti) and P(Tj) are the priorities of Ti and
Tj, respectively. Let L(Ti) be the security level of Ti. There
are three possible cases.

The first case is where L(Ti) > L(Tj). When both Ti
and Tj are about to access the same data item x, Ti reads
down x while Tj writes into x because of their levels. Since
P(Ti) > P(Tj), by the version selection algorithm, Ti reads
x written by Tk (not Tj) such that C_TS(Tk) < S_TS(Ti).
Thus, Ti is neither delayed nor aborted due to Tj. The
second case is where L(Ti) = L(Tj). Because Ti and Tj have
the same level, they should be scheduled only by a
protocol for RTDBMS that avoids priority inversion. Therefore,
Ti is not aborted or delayed by Tj. The last case is where
L(Ti) < L(Tj). Ti has a higher priority than Tj. Hence, if
Ti conflicts with Tj on the same data item x, Tj is aborted
by Ti using the compatibility matrix in Figure 1 (a) and (d).

² A serialization graph for a history H, SG(H), is a directed graph whose
nodes are transactions and whose edges represent all conflicting
relationships between two transactions.


For  all  possible  cases,  high-priority  transaction  Ti  precedes 
Tj.  □ 

5.3  Security  properties 

Theorem 4 No low-level transaction is ever delayed or
aborted by a high-level transaction. In addition, a
low-level transaction is not interfered with due to data
contention by a high-level transaction.

Proof: By the MLS property, a transaction can read and
write data items at its own level and only read down data
items at lower levels. Let Ti and Tj be two transactions
such that L(Ti) > L(Tj), where L(Ti) is the security level
of Ti. If Ti and Tj are conflicting with each other, then we
can see that Ti reads down the data item x while Tj writes
into x. There are two possible cases.

The first case is when P(Ti) < P(Tj). Because L(Ti)
is greater than L(Tj) and P(Ti) is less than P(Tj), Ti is
aborted or blocked according to the compatibility matrix in
Figure 1 (a) and (d). Therefore, Tj is neither delayed nor
aborted by Ti. The second case is when P(Ti) > P(Tj). By
the compatibility matrix in Figure 1 (b) and (c), Tj writes
x without delay and HH/LL-procedure is performed.
Thus, Tj is neither delayed nor aborted by Ti. Since
low-level transactions are neither delayed nor aborted, there
are no security violations. □

6  Performance  evaluation 

In this section, we present the simulation results to show
the performance of PSMVL, compared with two other
concurrency control protocols. The first protocol we compared
with PSMVL is the Unconditional Multiversion Two Phase
Locking (UMV2PL) protocol [10] for real-time databases.
The UMV2PL protocol is based on MV2PL [6] and its
compatibility matrix is shown in Figure 5. In UMV2PL,
high priority transactions can abort low priority
transactions in order to avoid priority inversion. However, when
a high priority transaction requests a read or a certify lock,
if a low priority transaction holds a certify lock, then the
low priority transaction can convert its certify lock to a
write lock, eliminating the conflict between the two
transactions. Thus, UMV2PL reduces the number of
abortions of low priority transactions.

The other protocol is the 2PL-HP protocol [1] for
real-time databases. The 2PL-HP protocol is based on 2PL
with a priority-based conflict resolution scheme to
eliminate priority inversion. By comparing the performance of
PSMVL with those protocols, the cost of satisfying
security and timing requirements can be quantified.

6.1  Simulation  model 

In order to evaluate the performance of our protocol,
we use SLAM II [3] and adopt the simulation model
shown in Figure 6 (a). The parameters used in the
simulation study are presented in Figure 6 (b).

Figure 5: The compatibility matrix of UMV2PL

Figure 6: Simulation model and parameters

We compare PSMVL with UMV2PL and 2PL-HP in
terms of the number of restart transactions, the average
service time per transaction, and the fairness, which shows
how evenly the missed deadlines are spread across the
input transactions of the various security levels. When a
transaction is generated, it is delivered to a transaction
scheduler which assigns a deadline and priority to the
transaction as follows. Since we assume a soft deadline for
each transaction, when a transaction misses its deadline, it
is not aborted.

F1. DeadLine(T) = ArrivalTime(T) + SlackTime *
TransactionSize(T) * CPUComputationTime

F2. Priority(T) = DeadLine(T) * 100
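
As a small worked illustration of F1 and F2, the following Python sketch computes a deadline and the derived priority value. The function and parameter names are hypothetical, and F2 is applied exactly as printed above; the text does not state whether a smaller value denotes a more urgent transaction.

```python
def deadline(arrival_time: float, slack_time: float,
             transaction_size: int, cpu_computation_time: float) -> float:
    """F1: arrival time plus slack-scaled estimated execution time."""
    return arrival_time + slack_time * transaction_size * cpu_computation_time


def priority(deadline_value: float) -> float:
    """F2 as printed: the deadline scaled by 100."""
    return deadline_value * 100


# Example: a transaction of 5 operations arriving at time 10,
# with slack factor 2 and 3 time units of CPU per operation.
d = deadline(arrival_time=10.0, slack_time=2.0,
             transaction_size=5, cpu_computation_time=3.0)
p = priority(d)
```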

To compute the fairness, for each security level i, we
use the formula,

F3. Fairness(i) = (MissTrans_i / NoTrans_i) / (MissTrans / NoTrans)

In the formula F3, MissTrans_i and NoTrans_i are the
number of transactions at level i which miss their
deadlines and the number of transactions at level i, respectively,
whereas MissTrans and NoTrans are the total number
of transactions which miss deadlines and the total number
of all input transactions, respectively. If MissTrans = 0,
then we let Fairness(i) be 0.
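
The fairness measure can be sketched in Python as below. The sketch assumes that F3 is the ratio of the miss ratio at level i to the overall miss ratio, which is one reading of the definitions above; the function name and parameters are hypothetical, and the case NoTrans_i = 0 is not handled.

```python
def fairness(miss_i: int, n_i: int, miss_total: int, n_total: int) -> float:
    """Per-level miss ratio divided by the overall miss ratio (assumed form of F3)."""
    if miss_total == 0:
        return 0.0               # stated convention when no deadline is missed
    return (miss_i / n_i) / (miss_total / n_total)
```

Under this reading, a value of 1 means that level i misses deadlines at the same rate as the whole workload, and values above 1 mean that level i is penalized more than average.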


6.2  Experimental  results 

The results of our performance analysis are shown in
Figures 7, 8, 9, and 10.

(b) Transaction size = 20. Write operation ratio = 0.7.

Figure 7: Security violation. Write operation ratio = 0.7

In Figure 7, we compare the three protocols PSMVL,
UMV2PL, and 2PL-HP, in terms of the number of times
that low level transactions are delayed or aborted by high
level transactions. The x-axis represents the mean
interarrival time (MIAT), which is the average time
interval between the generations of transactions. If MIAT is
small, then transactions are created more frequently. As
shown in Figure 7, low level transactions are never
delayed by high level transactions in PSMVL. On the other
hand, UMV2PL, which is based on multiple versions, has fewer
blockings than 2PL-HP, which is based on a single version.
2PL-HP has the worst performance because of its
"wasted restarts". The transaction length in Figure 7 (a)
is shorter than the length in Figure 7 (b). For short
transactions, the number decreases rapidly when MIAT is
increased gradually, while the number decreases slowly for
long transactions.
(b) Transaction size = 20. Write operation ratio = 0.7.


Figure  8:  Miss  percentage.  Write  operation  ratio 


one of the causes that increase the number of restart
transactions. Since 2PL-HP uses a single version, the number
of restart transactions is higher when the transactions are
scheduled using 2PL-HP, compared to UMV2PL.
Figure 8 shows the percentage of transactions that miss
their deadlines, denoted by Miss Percentage. Miss
Percentage is calculated with the following equation: Miss
Percentage = 100 * (the number of tardy jobs / the total
number of jobs). The number of N (No) entries in the compatibility
matrix of UMV2PL is larger than that in the compatibility
matrix of PSMVL. This causes UMV2PL to have more
restart transactions than PSMVL. This is especially true
for a high arrival rate, i.e., when MIAT is small, PSMVL
shows better performance than UMV2PL. Therefore, a
high arrival rate increases the number of restart
transactions and results in a high miss percentage.

When the transactions are short and the arrival rate of
transactions is low, the miss percentage is rapidly reduced.
However, for long transactions, the miss percentage is
reduced more slowly. This indicates that long transactions are


Figure  9:  Average  response  time.  Write  operation  ratio  = 
0.25 


Figure 9 shows the average service time per transaction. Let T_j be a transaction. Then, the average service time is the sum of the service times of all T_j divided by the total number of transactions. The x and y axes represent interarrival times and average service times, respectively. Since we assume a soft deadline for each transaction, a transaction that misses its deadline is not aborted. Instead, it continues execution until its commitment. Therefore, transactions that miss their deadlines can be restarted several times. As shown in Figure 8, when the number of restart transactions increases, it takes more time to finish the transactions. PSMVL shows better performance than UMV2PL, even though PSMVL has additional features such as security requirements. If the time interval between two transactions is short, the possibility of conflicts between transactions is increased.


Figure 10: The fairness of the PSMVL. Mean InterArrival Time = 40

Figure 10 shows the fairness of PSMVL. In the figure, level 5 is the highest and level 1 is the lowest. When the transaction size is 15, only the transactions of level 2 and level 4 miss their deadlines. This shows that in PSMVL, the highest level transactions are not always sacrificed. When the transaction size becomes smaller, the number of transactions that miss their deadlines decreases. Therefore, if the number of deadline-missing transactions is small, then the divisor of the formula (F3) is very small. As a result, for a security level i, Fairness(i) becomes large if the numerator of the formula (F3) is not zero. Figure 10 also shows that when the transaction size increases, the values of Fairness(i) for the security levels i get closer to each other. This means that the deadline-missing transactions are evenly distributed across the security levels and are not influenced by the security levels.

7  Conclusion 

Database systems for real-time applications must satisfy timing constraints associated with transactions. Typically a timing constraint is expressed in the form of a deadline and is represented by a priority. In this paper, we have classified transaction processing systems according to their requirements and identified the conflicting nature of security requirements and real-time requirements. To address the problem, we have presented a new priority-driven multiversion locking protocol for scheduling transactions to meet their timing constraints in real-time secure database systems. The schedules produced by the protocol were proven to be one-copy serializable. We also presented our simulation model and evaluation results of the relative performance of the protocol, compared with other protocols.

The work described in this paper can be extended in several ways. First of all, in this paper we have not considered any trade-offs between real-time requirements and security requirements. A trade-off could be made between those two conflicting requirements, depending on the specification of the application. For example, it would be interesting to see how a policy to screen out transactions that are about to miss their deadline would affect performance. Secondly, we have restricted ourselves by not distinguishing temporal and non-temporal data management. By exploiting the semantic information of transactions and the type of data they access, the protocol could be extended to provide a higher degree of concurrency. Finally, in this paper, we have restricted ourselves to the problem of real-time secure concurrency control in a database system. There are other issues that need to be considered in designing comprehensive MLS/RT DBMSs, including architectural issues, recovery, and data models. We have started to look into those issues.

Abstract

A new approach is introduced to evaluate inference risks in element-level labelling relational databases. Techniques from rough set theory are used to capture the semantics of data*, and a quantitative measure, the Inference Risk Index (IRI), has been defined to characterise possible inference risks due to material implications reflected by the data. The approach is shown to be able to take into account all certain and possible material implications in the data, including functional dependencies. It can also be used to address inference threats posed by rule-induction techniques from data mining. A major advantage of our approach is that the quantitative measure IRI is computed directly from data without knowledge input from the System Security Officer. The computation is efficient and allows for real-time monitoring of inference risks during database run-time. Therefore, we are able to follow the changes in data patterns during database lifetime.

1  Introduction 

In multilevel databases, inference has long been identified as a major threat to security. An inference problem in a multilevel database arises when a user with a low-level clearance, accessing information of low classification, is able to draw conclusions about information at higher classifications [16]. Marks [13] gives a formal definition of database inference: Inference in a database is said to occur if, by retrieving a set of tuples $\{T\}$ having attributes $\{A\}$ from the database, it is possible to specify a set of tuples $\{T'\}$, having attributes $\{A'\}$, where $\{T'\} \not\subset \{T\}$ or $\{A'\} \not\subset \{A\}$. In logic settings, we say there exists a material implication, denoted $(T.A) \Rightarrow (T'.A')$, that relates the two sets. In other studies, material implications are sometimes referred to as secondary paths [1] or inference channels [16].

*  In  this  paper,  we  use  the  term  semantics  of  data  to  mean 
the  properties  of  data  in  KRS  context,  which  is  independent 
from  the  semantics  of  application  represented  by  the  relational 
database. 


Here, we identify two types of inference problems. One is due to classification inconsistencies which arise from poor database design. In this case, secondary paths are formed by doing “joins” on the base tables in the databases to construct restricted tuples. Considerable research has been done to address this problem (see, for example, the work of Hinke [9], Thuraisingham [23], Binns [1], Burns [3], Garvey et al. [6], Hinke and Delugach [8], Lin [11], Qian et al. [16] and Rath et al. [17]). The aim of these works is to yield a well designed multilevel database in the sense that a user cannot, through any series of database queries, actually derive a restricted tuple.

The second type of inference problem is due to the semantics of application. Information stored in a relational database comprises not only individual data items, but also the interrelationships among these data items. However, the relational data model emphasises efficient data structuring and manipulation. It lacks an adequate representation of data dependencies that characterise the application. Thus, multilevel classification of data items through element-level labelling is not enough to keep the data secret if the semantics of application or dependencies among data are not taken into account. The semantics of application is sometimes called data dependencies, constraints, or rules in different settings.

The second type of inference is harder to address since we need a way to find out and express the semantics of application in order to detect, analyse and eliminate the inference channels. In contrast, the approaches to address the first type of inference threats can be seen as syntactic in the sense that they work through schema manipulation without knowing the semantics of attributes.

Extensive work has been done to address the second type of inference. Some approaches focus on eliminating the inference channels under the assumption that data dependencies are already known (see, for example, the work of Su and Ozsoyoglu [19], Stickel [22]). Others start from detecting and analysing inference threats through various structures. The structures used to capture semantics of application include Sphere Of Influence (Morgenstern [14]), Semantic Relationship Graph (Hinke [9]), Abductive Reasoning (Garvey et al. [7]), Semantic Net (Thuraisingham [23]), Graph (Garvey et al. [6]), Conceptual Graph (Thuraisingham [23], Hinke and Delugach [8], Delugach and Hinke [5]), Context (Rath et al. [17]) and Pattern (Marks [13]). (Some of these techniques can also be used to address the first type of inferences.) However, available approaches have the following shortcomings.

• To some extent, all these approaches except [13] need help from the System Security Officer (SSO) to generate the desired structures. The knowledge input from the SSO represents the semantics of application. For example, in [6], the dependency of a flight's mission on its cargo has to be manually input into DISSECT in order for DISSECT to detect the inference channel between a flight's departure time and its mission. We think that if there is a relationship between a flight's mission and its cargo, this dependency must be reflected in the data in terms of material implication and we should be able to explicate it from the data directly. The fact that the SSO can never be sure he knows all the dependencies among data means that available approaches only provide partial solutions. At most, they can claim that to the best of our knowledge there is no inference channel. There might well be data dependencies that the SSO is unaware of, or that are introduced into the database during its lifetime and were not envisioned at database design time.

• Sometimes inference is certain, such as through functional dependency. However, more frequently we have cases in which inference is partial or with certain probability. Several previous works have addressed this situation, for example, Morgenstern [14], Garvey et al. [7] and Binns [1]. However, in existing approaches the probabilities are either assumed ([1]) or computed with the knowledge from the SSO ([14], [7]). As in the previous point, we argue that if there is a probability associated with an inference path, this probability should be reflected by the data through material implication and can be explicated from the data itself.

• More recently, knowledge discovery in databases (KDD) or data mining (DM) techniques raise new security concerns for databases (Lin et al. [12]). One of the primary approaches in KDD is rule induction, or learning from examples (see, for example, the works of Hu et al. [10], Srikant and Agrawal [18]). The generalized rules derived by KDD may open up new inference channels. Since the SSO may not be aware of such generalized rules, previous approaches to the second type of inference are inadequate in addressing threats posed by KDD or DM.

In  this  study,  we  propose  an  approach  to  detection 
and  evaluation  of  the  second  type  of  inference  threats 
that  overcomes  the  above  deficiencies. 

2  Overview 

We adopt the closed world assumption (CWA). By CWA we mean that the data instances are complete and domain definitions are fully instantiated. Under CWA, all the material implications ([13]) corresponding to possible inferences can be derived from data. This does not mean all the knowledge needed to complete an inference chain has to reside in the database, i.e. the database doesn't have to contain all the semantics of an application. A chain of inference can be completed using outside knowledge. However, since the start and the end attribute values of an inference chain are in the database, there must be a material implication that corresponds to that inference chain and we should be able to discover that material implication from data under CWA. For a detailed discussion of material implication and inference chain, please see [13].

Unlike previous approaches, here we are not trying to discover the semantics of application or the knowledge that can be used for possible logical inferences. As we have pointed out before, knowledge-based approaches may be incomplete since neither the database nor the SSO may have or know all the semantics of application. Instead we use rough set theory as our tool to capture the semantics of data and quantify the inference risks through material implications. Since all logical inferences have corresponding material implications in the database, we are able to address all the possible inferences through material implications. Meanwhile, not all material implications have corresponding logical inference paths, i.e. there may not be any apparent causal reason for a particular material implication. However, we think such material implications are still a legitimate concern for the current state of data in the database. We will say more about it in Section 5.

Rough set theory concerns the classificatory analysis of imprecise, uncertain or incomplete information. It is a very effective methodology for data analysis and discovering rules in attribute-value based domains. It is also an efficient tool for database mining in relational databases (Lin [12]). This technique, which is complementary to statistical methods of inference, provides a new insight into properties of data. The main focus of this technique is on the investigation of structural relationships in data rather than probability distributions, as is the case in statistical theory. One of the main advantages of rough set theory is that it does not need any preliminary or additional information about data, such as a probability distribution in statistics, or a grade of membership or value of possibility in fuzzy set theory.

The paper is organised as follows. In section 3, we give the necessary concepts of rough set theory. In section 4 our approach is presented. Discussion of the approach follows in section 5. We draw our conclusions in section 6.

3 Basic Concepts of Rough Set Theory

Rough set theory was first introduced by Pawlak [15]. The primary problem addressed by the technique of rough sets is the discovery, representation and analysis of data regularities. The rough set-based methods are particularly useful for reasoning from qualitative or imprecise data.

3.1  Knowledge  Representation  System 

In rough set theory, a Knowledge Representation System (KRS) is a quadruple

$S = (U, A, V, f)$,

where $U$ is a non-empty, finite set called the universe, $A$ is a finite set of attributes, $V = \bigcup_{a \in A} V_a$ is a union of domains of attributes $a$ belonging to $A$, and $f : U \times A \to V$ is an information function such that $f(x, a) \in V_a$ for every $a \in A$ and $x \in U$.

The information function assigns attribute values to objects belonging to $U$. The Knowledge Representation System allows for convenient tabular representation of data, which is similar to a relational table in the relational database model (cf. Codd [4]). However, the relational model is not interested in the meaning of the information stored in the table. The emphasis is placed on efficient data structuring and manipulation. In the Knowledge Representation System the attribute values, i.e., the table entries, have associated explicit meaning as features or properties of the objects.

3.2  Indiscernibility  Relation 

With every subset of attributes $P$ of $A$, we define an equivalence relation $IND(P)$, also called an indiscernibility relation, over $U$ as follows:

$(x, y) \in IND(P)$ iff $f(x, a) = f(y, a)$ for all $a \in P$, where $x, y \in U$.

Equivalence relation $IND(P)$ induces a classification of objects into classes, each of which consists of objects with the same values of attributes belonging to $P$. Let $U/P$ denote the family of equivalence classes of $IND(P)$. The equivalence classes of $IND(P)$ are also called definable sets or concepts of universe $U$ using knowledge $P$. A KRS is selective iff all classes of $U/A$ are one-element sets, i.e. $IND(A)$ is an identity relation.
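
The partition $U/P$ can be obtained by grouping objects on their attribute values. The following is a minimal Python sketch, assuming the KRS is held as a dictionary that maps object identifiers to attribute-value rows. The table `TOY` and the helper names `partition` and `is_selective` are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict


def partition(table, attrs):
    """Return U/P: the equivalence classes of IND(P) as a set of frozensets."""
    classes = defaultdict(set)
    for obj, row in table.items():
        # Objects with identical values on every attribute in P are indiscernible.
        classes[tuple(row[a] for a in attrs)].add(obj)
    return {frozenset(c) for c in classes.values()}


def is_selective(table):
    """A KRS is selective iff every class of U/A contains exactly one object."""
    all_attrs = sorted({a for row in table.values() for a in row})
    return all(len(c) == 1 for c in partition(table, all_attrs))


# Hypothetical toy KRS (not from the paper): three objects, two attributes.
TOY = {
    "x1": {"colour": "red", "size": "small"},
    "x2": {"colour": "red", "size": "small"},
    "x3": {"colour": "blue", "size": "small"},
}

print(partition(TOY, ["colour"]))  # two classes: {x1, x2} and {x3}
print(is_selective(TOY))           # x1 and x2 are indiscernible
```

In this toy table `x1` and `x2` agree on every attribute, so the KRS is not selective.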

The indiscernibility relation $IND(P)$ represents our ability to classify the objects in the universe using knowledge $P$. The central issue of the rough set approach is the determination of how well a subset $X$ belonging to universe $U$ can be characterised in terms of the information available to represent objects of the universe $U$. With each subset $X \subseteq U$ and a subset of attributes $P \subseteq A$, we can associate two subsets:

$\underline{P}X = \bigcup \{ Y \in U/P : Y \subseteq X \}$

$\overline{P}X = \bigcup \{ Y \in U/P : Y \cap X \neq \emptyset \}$

called the $P$-lower and $P$-upper approximation of $X$, respectively.

Let $x$ be in $U$ and $P \subseteq A$ be our knowledge about universe $U$. We say that $x$ is certainly in $X$ using knowledge $P$ iff $x \in \underline{P}X$, and that $x$ is possibly in $X$ using knowledge $P$ iff $x \in \overline{P}X$. Our terminology originates from the fact that we want to decide if $x$ is in $X$ on the basis of a definable set in $S$ rather than on the basis of $X$. This means we deal with $\underline{P}X$ and $\overline{P}X$ instead of $X$. In the logic settings, it is equivalent to say that there are certain rules that classify $x$ into $X$ using knowledge $P$ iff $x \in \underline{P}X$, and that there are possible rules that classify $x$ into $X$ using knowledge $P$ iff $x \in \overline{P}X$.
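
The two approximations can be sketched directly from their definitions. The Python below is an illustration under the assumption that the KRS is a dictionary from object identifiers to attribute rows; the table `TOY`, the target set `X` and all function names are hypothetical.

```python
from collections import defaultdict


def partition(table, attrs):
    """Return U/P as a list of sets of object identifiers."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())


def lower_approx(table, attrs, X):
    """P-lower approximation: union of the classes of U/P contained in X."""
    X = set(X)
    return set().union(*(Y for Y in partition(table, attrs) if Y <= X))


def upper_approx(table, attrs, X):
    """P-upper approximation: union of the classes of U/P that intersect X."""
    X = set(X)
    return set().union(*(Y for Y in partition(table, attrs) if Y & X))


# Hypothetical toy KRS: objects 1 and 2 are indiscernible on "grade".
TOY = {1: {"grade": 2}, 2: {"grade": 2}, 3: {"grade": 5}, 4: {"grade": 8}}
X = {1, 3}

print(lower_approx(TOY, ["grade"], X))  # objects certainly in X
print(upper_approx(TOY, ["grade"], X))  # objects possibly in X
```

Here object 3 is certainly in `X`, while objects 1 and 2 are only possibly in `X` because the knowledge `grade` cannot tell them apart.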

3.3  Dependency  of  Attributes 

One  of  the  main  problems  in  the  analysis  of  KRS 
with  respect  to  discovering  cause-effect  relationships 
in  data  is  the  identification  of  dependencies  among 
different  groups  of  attributes. 

Let $C$, $D$ be two subsets of $A$ with $C \cap D = \emptyset$, called condition and decision attributes, respectively. We introduce the notion of a positive region of $U/D$, $POS(C, D)$, as a union of the $C$-lower approximations of all equivalence classes of the relation $IND(D)$:

$POS(C, D) = \bigcup \{ \underline{C}Y : Y \in U/D \}$.

The positive region of $U/D$ is a discernible part of $U$; that is, any object in $POS(C, D)$ can be uniquely classified into one of the classes of $U/D$ based solely on the knowledge of $C$, i.e. the values of attributes in $C$.

We say that the set of attributes $D$ depends in degree $k$ ($0 \le k \le 1$) on the set of attributes $C$ in $S$ if

$k(C, D) = \mathrm{card}(POS(C, D)) / \mathrm{card}(U)$

where card is the cardinality of a set. The value $k(C, D)$ provides a measure of dependency between $C$ and $D$. If $k = 1$, then the dependency is full or functional; if $0 < k < 1$, then there is partial dependency; if $k = 0$, then the attributes $C$ and $D$ are independent. A dependency close to 1 gives reason to hypothesize that generally, there is a strong cause-effect relationship between attributes $C$ and $D$, and a dependency close to 0 suggests a weak, if any, cause-effect relationship between $C$ and $D$.
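
As an illustration of the positive region and the degree of dependency, the sketch below follows the two definitions literally. It assumes a dictionary-based KRS; the table `TOY`, its attribute names `cond` and `dec`, and the function names are hypothetical. Exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction


def partition(table, attrs):
    """Return U/P as a list of sets of object identifiers."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())


def positive_region(table, C, D):
    """POS(C, D): union of the C-lower approximations of the classes of U/D."""
    pos = set()
    for Y in partition(table, D):
        for Z in partition(table, C):
            if Z <= Y:  # Z belongs to the C-lower approximation of Y
                pos |= Z
    return pos


def dependency(table, C, D):
    """k(C, D) = card(POS(C, D)) / card(U)."""
    return Fraction(len(positive_region(table, C, D)), len(table))


# Hypothetical toy KRS: objects 1 and 2 share "cond" but differ on "dec".
TOY = {
    1: {"cond": "a", "dec": 0},
    2: {"cond": "a", "dec": 1},
    3: {"cond": "b", "dec": 1},
    4: {"cond": "c", "dec": 0},
}

print(positive_region(TOY, ["cond"], ["dec"]))
print(dependency(TOY, ["cond"], ["dec"]))
```

Only objects 3 and 4 can be classified by `cond` alone, so the dependency in this toy table is partial.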

3.4  Significance  of  Attributes 

The relative contribution or significance of an individual attribute $a$ belonging to $C$ with respect to the dependency between $C$ and $D$ is represented by the significance factor $SGF$, given by

$SGF(a, C, D) = [k(C, D) - k(C - \{a\}, D)] / k(C, D)$ if $k(C, D) > 0$.

Formally, the significance factor reflects the relative degree of decrease of dependency level between $C$ and $D$ as a result of the removal of the attribute $a$ from $C$. In practice, the stronger the influence of the attribute $a$ on the relationship between $C$ and $D$, the higher the value of the significance factor.
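
The significance factor can be sketched as follows. This is an illustrative Python fragment, assuming a dictionary-based KRS; the table `TOY` and its attribute names are hypothetical. The dependency is computed through the equivalent test that an object lies in $POS(C, D)$ exactly when its $C$-class is contained in its $D$-class.

```python
from fractions import Fraction


def eq_class(table, attrs, x):
    """[x]_IND(attrs): all objects that agree with x on every attribute in attrs."""
    return {y for y, row in table.items()
            if all(row[a] == table[x][a] for a in attrs)}


def dependency(table, C, D):
    """k(C, D); x is in POS(C, D) exactly when [x]_C is contained in [x]_D."""
    pos = [x for x in table if eq_class(table, C, x) <= eq_class(table, D, x)]
    return Fraction(len(pos), len(table))


def sgf(table, a, C, D):
    """SGF(a, C, D) = [k(C, D) - k(C - {a}, D)] / k(C, D), for k(C, D) > 0."""
    k_full = dependency(table, C, D)
    if k_full == 0:
        raise ValueError("SGF is only defined when k(C, D) > 0")
    reduced = [b for b in C if b != a]
    return (k_full - dependency(table, reduced, D)) / k_full


# Hypothetical toy KRS: "pay" is determined by "grade" alone.
TOY = {
    1: {"grade": 1, "site": "a", "pay": "hi"},
    2: {"grade": 1, "site": "b", "pay": "hi"},
    3: {"grade": 2, "site": "a", "pay": "lo"},
    4: {"grade": 2, "site": "b", "pay": "lo"},
}
C, D = ["grade", "site"], ["pay"]

print(sgf(TOY, "grade", C, D))  # removing grade destroys the dependency
print(sgf(TOY, "site", C, D))   # removing site changes nothing
```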

4  A  Quantitative  Approach 

In this work, we assume that data is stored in a relational database consisting of a series of tables and that sensitive data items are protected by element-level labelling. Since we are not addressing the first type of inference problem identified above, we will further assume the universal relation paradigm [24] and view a relational database as a single table containing all the attributes from the entire database. We are not concerned about the actual mechanics of forming the view of a universal relation. For the second type of inference problem, we are only interested in evaluating the inference risks for those classified data items in the universal relation.

A relational table can be considered as a Knowledge Representation System in which columns are labelled by attributes, rows are labelled by the objects and the entry in column $p$ and row $x$ has the value $p(x)$. Each row in the relational table represents information about some object in universe $U$. However, the relational model is not interested in the meaning of the information stored in the table. Consequently the objects about which information is contained in the table may not be represented in the table. In the KRS, by contrast, all objects are explicitly represented and the attribute values, i.e., the table entries, have associated explicit meaning as features or properties of the objects.

One way to reconcile the two models, which we adopt in this paper, is to take the primary key $K$ in the relational table as object identifier and all other attributes ($A' - K$) as attributes in the KRS. In doing so, we may lose some information contained in the primary key attributes which is relevant to the semantics of the application. Another way to get around this is to add another attribute which assigns a unique identifier to each row of the relational table. In this case, all the attributes in the original relational table become attributes in the corresponding KRS and there is no information loss. However, in the relational data model there is always a primary key which identifies every object in the table, which means the derived KRS is selective. This situation may still happen even if we use the primary key as object identifier. A selective KRS is not itself difficult to analyse, but we might have to lower the degree of precision in order to derive generalized rules. We will say more about it with the example later.

We should point out that in real applications, some data values of a relational table may be missing or imprecise. Consequently, the derived KRS is incomplete. For simplicity, in this paper we assume that all data items are precisely defined. Interested readers are referred to [20, 21] for dealing with uncertain data in the rough set context.

4.1  Formal  Specification 

For simplicity, we assume a two-level labelling system: Classified and Unclassified. Given a universal relation $R$ and its set of attributes $A'$, we view the relational table as a KRS in which every tuple of $R$ is an object of universe $U$ with the value $x$ of primary key $K$ as object identifier. The corresponding KRS has attribute set $A = A' - K$.

Instead of defining structures to detect and analyse inference risks, we choose to characterise the inference risk for each subset of classified attribute values belonging to the same object by an Inference Risk Index (IRI).

Definition For a classified data element set $(x, P)$ in column set $P$ and row $x$ with value set $P_x$, let attribute set $B$ denote the set of attributes whose values are classified for object $x$ ($P \subseteq B$). Let attribute sets $C = A - B$ and $D = P$. The associated Inference Risk Index $IRI(x, P)$ for data element set $(x, P)$ is defined as

$IRI(x, P) = \mathrm{card}([x]_{IND(C)} \cap [x]_{IND(D)}) / \mathrm{card}([x]_{IND(C)})$

where card is the cardinality of a set, and $[x]_{IND(C)}$ and $[x]_{IND(D)}$ are the equivalence classes of $IND(C)$ and $IND(D)$ containing $x$, respectively.

Intuitively,  IRI  gives  us  a  quantitative  measure  of 
inference  risks  due  to  possible  material  implications 
existing  in  the  database.  It  is  computed  from  data 
directly. 
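
The definition translates into a few lines of code. The following Python sketch is an illustration only, assuming a dictionary-based KRS; the table `TOY`, in which `level` plays the role of the classified attribute, and the function names are hypothetical.

```python
from fractions import Fraction


def eq_class(table, attrs, x):
    """[x]_IND(attrs): all objects that agree with x on every attribute in attrs."""
    return {y for y, row in table.items()
            if all(row[a] == table[x][a] for a in attrs)}


def iri(table, x, C, D):
    """IRI = card([x]_IND(C) intersected with [x]_IND(D)) / card([x]_IND(C))."""
    cx = eq_class(table, C, x)
    dx = eq_class(table, D, x)
    return Fraction(len(cx & dx), len(cx))


# Hypothetical toy KRS: "dept" is visible, "level" is the classified attribute.
TOY = {
    1: {"dept": "a", "level": 0},
    2: {"dept": "a", "level": 1},
    3: {"dept": "b", "level": 1},
}

print(iri(TOY, 1, ["dept"], ["level"]))  # half of the dept-a objects share x's level
print(iri(TOY, 3, ["dept"], ["level"]))  # dept b determines the level with certainty
```

Because `x` always belongs to both of its own equivalence classes, the numerator is at least 1 and the denominator is never zero.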

From the definition of IRI, it is easy to see that for any classified data element set $(x, P)$, $0 < IRI(x, P) \le 1$. If $IRI(x, P) = 1$, we have $[x]_{IND(C)} \subseteq [x]_{IND(D)}$, which means there is a certain rule that can be induced from $C$ to $D$ for all the objects/tuples in $[x]_{IND(C)}$, i.e. there is a certain material implication $([x]_{IND(C)}, C) \Rightarrow ([x]_{IND(C)}, D)$.

Since it is always true that $x \in ([x]_{IND(C)} \cap [x]_{IND(D)})$, there is always a possible rule that can be induced from $C$ to $D$ for all the objects/tuples in $[x]_{IND(C)}$, i.e. there is a possible material implication $([x]_{IND(C)}, C) \Rightarrow ([x]_{IND(C)}, D)$. Therefore, $IRI(x, P)$ is always greater than 0. This is obvious since object/tuple $x$ itself is a valid instance for the possible material implication.

Specifically, if $IND(C) \subseteq IND(D)$, i.e. $D$ is functionally dependent on $C$, $IRI(x, P) = 1$ for any value of $P_x$. This conforms to our intuitive notion of functional dependency. Since attribute set $C$ functionally determines attribute set $D$, given values of $C$ the attacker should be able to infer values for $D$. Please note that there are cases where there is no functional dependency between $C$ and $D$, but IRI is still equal to 1. In these cases, there are certain rules between $C$ and $D$ that are valid only for the specific value of $P_x$, not for all the possible values of $C$ and $D$ in terms of the classical definition of functional dependency.

Finally, we should point out that our definition of IRI is conservative since $IND(C)$ and $IND(D)$ are calculated with full knowledge of relevant data in the database. In reality, an attacker can only see part of the data needed.

4.2 Algorithm

We present an outline of the process of computing IRI. Optimal algorithms are our current research topic.

Step 1: Decide attribute sets $C$ and $D$.

Step 2: Decide the most significant attribute set $C_s$ of $C$.

Step 3: Compute IRI according to the definition using $C_s$ instead of $C$.

Step 4: (optional) Generalize some of the attributes in $C$ or $D$ and repeat the whole process.
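
Steps 1 to 3 can be sketched as a small pipeline. The Python below is a hedged illustration, not the authors' implementation: the outline does not say how the most significant attribute set is chosen, so the sketch assumes a simple rule that keeps the attributes whose SGF reaches a hypothetical `threshold` (falling back to all of C if none does). The table `TOY` and every function name are likewise assumptions.

```python
from fractions import Fraction


def eq_class(table, attrs, x):
    """[x]_IND(attrs): all objects that agree with x on every attribute in attrs."""
    return {y for y, row in table.items()
            if all(row[a] == table[x][a] for a in attrs)}


def dependency(table, C, D):
    """k(C, D); x is in POS(C, D) exactly when [x]_C is contained in [x]_D."""
    pos = [x for x in table if eq_class(table, C, x) <= eq_class(table, D, x)]
    return Fraction(len(pos), len(table))


def sgf(table, a, C, D):
    """Significance factor of attribute a; the caller ensures k(C, D) > 0."""
    k_full = dependency(table, C, D)
    return (k_full - dependency(table, [b for b in C if b != a], D)) / k_full


def iri(table, x, C, D):
    cx = eq_class(table, C, x)
    return Fraction(len(cx & eq_class(table, D, x)), len(cx))


def evaluate_iri(table, x, P, classified, threshold=Fraction(1, 2)):
    # Step 1: C holds the attributes of x that are not classified, D = P.
    C = [a for a in table[x] if a not in classified]
    D = list(P)
    # Step 2 (assumed rule): keep attributes whose SGF reaches the threshold.
    C_s = C
    if dependency(table, C, D) > 0:
        C_s = [a for a in C if sgf(table, a, C, D) >= threshold] or C
    # Step 3: compute IRI with the reduced condition set.
    return C_s, iri(table, x, C_s, D)


# Hypothetical toy KRS: "pay" is classified for object 1.
TOY = {
    1: {"grade": 1, "site": "a", "pay": "hi"},
    2: {"grade": 1, "site": "b", "pay": "hi"},
    3: {"grade": 2, "site": "a", "pay": "lo"},
    4: {"grade": 2, "site": "b", "pay": "lo"},
}

print(evaluate_iri(TOY, 1, ["pay"], classified={"pay"}))
```

In the toy table `site` carries no information about `pay`, so it is dropped in Step 2 and the IRI is evaluated on `grade` alone.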

If we skip Step 2, we are able to find all the material implications existing in the data. However, in real applications, not all these material implications accurately capture the dependencies of data. Some irrelevant attributes may contribute, even though in a negligible way under CWA, to the discernibility of knowledge $C$ and therefore disturb the real dependency we are trying to express using IRI. The purpose of Step 2 is to remove this noise, which may disguise true dependencies in the data. This is more useful for smaller databases.

Sometimes, the discernibility due to attributes of the derived KRS is very high. We may find a large number of material implications, but each of them may have only a small number of valid instances in the database. This might not be interesting since we might want to know the inference risks due to qualitative rules. In this case, we can try to generalise some of the attributes in order to discover and evaluate qualitative rules.

It is apparent that the above algorithm permits fast and efficient evaluation of IRI. In fact, one of the advantages of rough set theory is that programs implementing its methods can easily run on parallel computers. Therefore it is possible to monitor inference risks during database run-time by real-time evaluation of IRI. In doing so, we are also able to follow the changes in data patterns during database lifetime.


4.3  An  Example 

EMP_ID   Grade   Location   Salary
1        2       CA         8000
2        3       NY         6000
3        5       TX         5000
4        8       TX         2000
5        2       CA         7000(C)
6        8       CA         2000
7        5       TX         5000(C)
8        5       CA         4000
9        7       NY         3000
10       7       MA         3000

Figure 1: Original Employee Table

Consider the multilevel relational table shown in Figure 1. The relation can be seen as a
KRS with primary key EMP_ID as object identifier. The attributes of the KRS are Grade,
Location and Salary. Now we want to calculate IRI for object 5 and classified value 7000
of attribute Salary. First we use C = {Grade, Location} and D = {Salary} to evaluate IRI.
We can derive directly from the table that
U/C = { {1, 5}, {3, 7}, {2}, {4}, {6}, {8}, {9}, {10} } and
U/D = { {1}, {2}, {3, 7}, {4, 6}, {5}, {8}, {9, 10} }.
Therefore, we have IRI(5, {Salary}) = 0.5. Similarly, we can find IRI(7, {Salary}) = 1.

If we take a closer look at the significance of attributes Grade and Location, we can
find that SGF(Grade, C, D) = 0.88 and SGF(Location, C, D) = 0.38. Therefore we may
remove attribute Location and use C1 = {Grade} to evaluate IRI. Since
U/C1 = { {1, 5}, {2}, {3, 7, 8}, {4, 6}, {9, 10} }, we have IRI(5, {Salary}) = 0.5
and IRI(7, {Salary}) = 0.67. These figures give us a more accurate evaluation of the
dependency between Grade and Salary.
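
The figures in this example can be reproduced with a short sketch. It assumes, as a reading of the example rather than a restatement of the formal definition given earlier in the paper, that IRI(x, D) is the fraction of objects in the C-indiscernibility class of x that also share x's value on D. The names EMPLOYEES, COLUMNS, partition and iri are illustrative only, and the SGF values are not recomputed here.

```python
from collections import defaultdict

# Figure 1 as EMP_ID -> (Grade, Location, Salary); the "(C)" marks are omitted.
EMPLOYEES = {
    1: (2, "CA", 8000), 2: (3, "NY", 6000), 3: (5, "TX", 5000),
    4: (8, "TX", 2000), 5: (2, "CA", 7000), 6: (8, "CA", 2000),
    7: (5, "TX", 5000), 8: (5, "CA", 4000), 9: (7, "NY", 3000),
    10: (7, "MA", 3000),
}
COLUMNS = {"Grade": 0, "Location": 1, "Salary": 2}


def partition(table, attrs):
    """Return the indiscernibility classes U/attrs as a list of sets of ids."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[COLUMNS[a]] for a in attrs)].add(obj)
    return sorted(classes.values(), key=min)


def iri(table, obj, cond, dec):
    """Assumed reading of IRI: share of obj's C-class that has obj's D-value."""
    c_block = next(b for b in partition(table, cond) if obj in b)
    d_block = next(b for b in partition(table, dec) if obj in b)
    return len(c_block & d_block) / len(c_block)


print(partition(EMPLOYEES, ["Grade", "Location"]))
print(iri(EMPLOYEES, 5, ["Grade", "Location"], ["Salary"]))  # 0.5
print(iri(EMPLOYEES, 7, ["Grade", "Location"], ["Salary"]))  # 1.0
print(round(iri(EMPLOYEES, 7, ["Grade"], ["Salary"]), 2))    # 0.67
```

Under this reading, dropping Location merges object 8 into the class of objects 3 and 7, which is why the index for object 7 falls from 1 to 0.67.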


EMP_ID   Grade   Location   Salary
1        2       CA         HIGH
2        3       NY         HIGH
3        5       TX         MIDDLE
4        8       TX         LOW
5        2       CA         HIGH(C)
6        8       CA         LOW
7        5       TX         MIDDLE(C)
8        5       CA         MIDDLE
9        7       NY         LOW
10       7       MA         LOW

Figure 2: Generalized Employee Table

We can also generalise some of the concepts in the data to evaluate inference risks due
to qualitative rules. Suppose salary range is defined to be HIGH = [...],
MIDDLE = 3500 - 5500, and LOW = 1500 - 3500. (Generalisation can also be done with
attributes in C.) Thus the original table in Figure 1 turns into the one shown in
Figure 2. It is immediate that attribute Grade functionally determines attribute Salary.
Not surprisingly, we find IRI(5, {Salary}) = 1 and IRI(7, {Salary}) = 1, which means
that there are generalized certain rules that can be induced from attribute Grade to
Salary. These generalized rules are more likely to be true since they are based on a
larger number of valid instances.
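
The effect of generalisation can be checked with a small sketch. The MIDDLE and LOW bounds follow the text; treating every salary of 5500 or more as HIGH is an assumption, and the names ROWS, generalise and determines are illustrative only.

```python
from collections import defaultdict

# Figure 1 rows as (EMP_ID, Grade, Salary); Location is not needed here.
ROWS = [
    (1, 2, 8000), (2, 3, 6000), (3, 5, 5000), (4, 8, 2000), (5, 2, 7000),
    (6, 8, 2000), (7, 5, 5000), (8, 5, 4000), (9, 7, 3000), (10, 7, 3000),
]


def generalise(salary):
    """Map a numeric salary to a qualitative range."""
    if salary >= 5500:   # assumed lower bound for HIGH
        return "HIGH"
    if salary >= 3500:   # MIDDLE = 3500 - 5500, as in the text
        return "MIDDLE"
    return "LOW"         # LOW = 1500 - 3500, as in the text


def determines(rows, value_of):
    """True if every Grade maps to exactly one value of value_of(Salary)."""
    seen = defaultdict(set)
    for _, grade, salary in rows:
        seen[grade].add(value_of(salary))
    return all(len(values) == 1 for values in seen.values())


print(determines(ROWS, lambda s: s))  # False: Grade 2 has salaries 8000 and 7000
print(determines(ROWS, generalise))   # True: Grade determines the generalised Salary
```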

5  Discussion 

As we have explained before, the second type of inference threat arises not from direct
penetration of the security mechanisms but rather from the very nature of the semantics
of the application. In classical terms, inference in relational databases refers to the
logical process of proving or deriving some classified attribute values for some tuples
from some unclassified attribute values. In this sense, inference is associated with some
type of causal relationship, such as functional dependencies. As has been pointed out by
Marks in [13], material implication is more general than functional dependency. Material
implications only require that sets of data and attributes occur together, regardless of
whether one causes the other, both are caused by a third activity, or they occur by
coincidence.

In this study, we take a different view of inference threats from that of Marks [13]. We
think that if a material implication is valid for the data in a database, it should be
considered as a possible inference path and should be taken into account by the SSO.
There might not be logical reasons for a particular material implication. However, the
lack of causal reasons may be due to the limits of the SSO’s understanding of the
semantics of the application. For example, in the case of scientific data, causality is
what scientists are trying to discover in the research process. On the other hand, even
if a material implication is just a coincidence, the very fact that this coincidence is
valid in the current state of the data means that it is a legitimate concern. The
material implication may be picked up by an attacker’s data mining tools and used to
derive a valid classified association unexpectedly. The attacker may not necessarily
believe the results, but, at least, the validity of the material implication alerts the
attacker that this could be true.

We notice that in [13] Marks insightfully formalises general inference as material
implication. He goes on to use Patterns to detect certain material implications reflected
in the data, which correspond to the cases where IRI = 1 in our approach. The rough set
based approach proposed in this paper is able to capture both certain and possible
material implications reflected in the data.

In addition, our approach is able to address inference risks due to generalized rules by
lowering the representation accuracy. The rough set approach that underlies the whole
technique is based on the intuitive observation that lowering the degree of precision in
the representation of objects, for example, by replacing numeric temperature
measurements by qualitative ranges of HIGH, NORMAL, or LOW, makes the data regularities
more visible and easier to characterise in terms of rules. Lowering the representation
accuracy, however, might lead to an undesired loss of information, expressed in a reduced
ability to discern among different concepts. To analyse and evaluate the effect of
different representation accuracies on concept discernibility levels, a number of
analytic tools have been developed [25]. By using these tools, one can attempt to find a
representation method that would compromise between sufficient concept discernibility
and the ability to reveal essential data regularities. It is, in fact, the ability of the
rough set based approach to generalise the concepts that enables it to deter attempts to
use generalized rules discovered by data mining for inference [10, 18].

In computing IRI, deciding the most significant subset of C is a tricky part. We are
investigating optimal algorithms. In this paper we considered functional dependency in
relational databases through material implication. Since various database dependencies
can be represented using indiscernibility relations [2], we plan to consider other
dependencies, such as multivalued dependencies, in the future.

6  Conclusion 

In this paper, we proposed a new approach to the evaluation of inference risks in
relational databases with element-level labelling. Techniques from rough set theory are
used to capture the semantics of data, and a quantitative measure, the Inference Risk
Index (IRI), has been defined to characterise possible inference risks due to material
implications reflected by the data. The approach is shown to be able to take into account
all certain and possible material implications in the data, including functional
dependencies. It can also be used to address inference threats posed by rule-induction
techniques from data mining.

A major advantage of our approach is that the quantitative measure IRI is computed
directly from data without knowledge input from the System Security Officer. The
computation is efficient and allows for real-time monitoring of inference risks during
database run-time. Therefore, we are able to follow the changes in data patterns during
the database lifetime.

Abstract

How will assurance and consistency be attained during the definition and usage of an
application’s user-role based security policy, particularly in an object-oriented context
that stresses change and evolution? This is an important question, especially with an
exploding interest in designing/developing object-oriented software in C++, Ada95, and
Java. Security-concerned users and organizations must be provided with the means to
protect and control access to object-oriented software. Our approach to user-role based
security (URBS) for object-oriented systems and applications has emphasized:

•  a customizable public interface that appears differently at different times for
specific users, to control and limit access;

•  security policy specification via a user-role hierarchy to organize and assign
privileges (public interface methods) based on responsibilities; and,

•  extensible/reusable URBS enforcement mechanisms that utilize inheritance, generics, and
exception handling for the automatic generation of code for the URBS security policy.

This paper expands our previous work to include assurance and consistency, particularly
since we are committed to a continued exploration of automatically generating URBS
enforcement mechanisms. This paper employs the field of software architectures to explore
intriguing solutions that further our URBS/object-oriented efforts.


1  Introduction 

How will assurance and consistency be attained during the definition and usage of an
application’s user-role based security policy, particularly in an object-oriented context
that stresses change and evolution?

This question is interesting, particularly with the explosive growth of object-oriented
software development. While C++ has been a strong player since the late 1980s, Ada95 and
Java offer new opportunities that are targeted for diverse and significant market
segments. Ada95, with its strong ties to DoD and government software, and Java, with an
increasing impact on commercial internet-based and general-purpose software, both expand
the base of software professionals working on object-oriented platforms. Security has
been a paramount concern, especially in Java, where security must be present to control
the effects of platform-independent software. Organizations will demand high consistency
and high assurance in object-oriented software, across a wide range of domains. Health
care systems require high levels of both consistency and assurance, while simultaneously
needing instant access to data in life-critical situations. In CAD applications, the most
up-to-date specifications on mechanical parts must be available in a shared manner to
promote cooperation and facilitate productivity, making consistency and assurance
important from a business perspective.

Unfortunately, security has often been an afterthought in the design and development
process. For example, in databases, it is often after relations have been implemented and
instances populated into the database that the security issues are considered by database
programmers writing transactions, rather than by the security engineer, where the
responsibility should truly rest. It is our strong belief that a cohesive and
comprehensive security policy can be an important means of identifying the set of
allowable users and insuring that there is a solid definition of each user’s
capabilities. Security must be an equal partner during design and development, allowing
the impact of the policy on all application components to be understood.

Over the past few years, we have concentrated on discretionary access control, by
defining a user-role based security (URBS) model that can be utilised in the design and
development of object-oriented systems and applications. The current public interface
provided by most object-oriented languages is the union of all privileges (methods)
needed by all users of each class. This allows methods intended for only specific users
to be available to all users, i.e., there is no way to prevent access by any user to a
method in the public interface. For example, in a health care application (HCA), a method
placed in the public interface to allow a Physician (via a GUI tool) to prescribe
medication on a patient can’t be explicitly hidden from a Nurse using the same GUI tool.
Rather, the software engineer is responsible for insuring that such access does not
occur, since the object-oriented programming language cannot inherently enforce the
required security access. Our approach promotes a customizable public interface [...]
user-role definition hierarchy (URDH) to organize responsibilities and to establish
privileges. Privileges can be assigned (can invoke a set of application methods) or
prohibited (cannot invoke a set of application methods) to roles. Our recent efforts have
proposed extensible and reusable URBS enforcement mechanisms that utilize inheritance,
generics, and exception handling for the automatic generation of code from the [...].
[...] has been to minimize the amount of knowledge a software engineer must have on [...]
having mechanisms that are self-contained class libraries, which supply all of the
required [...] and URBS [...] object-oriented design model [5, 11].

This paper expands our previous work to include assurance and consistency, particularly
since we are committed to a continued exploration of automatically generating URBS
enforcement mechanisms. We believe that class libraries may not offer a secure enough
venue to insure high consistency and assurance for enforcement mechanisms. Thus, we have
turned to the field of software architectures to investigate potential solutions to
augment our previous URBS enforcement approaches [2, 3, 4]. Software architectures [16]
expand traditional software engineering by looking at how different major system
components can mesh and interact. This is especially relevant for object-oriented
software where a class library for a problem is initially developed, with software
engineers designing and building tools to implement the overall capabilities of an
application. In such a model, the URBS enforcement mechanism must interact with both the
class library and the tools for an application, to insure that users utilizing tools only
access those portions of the application to which they have been granted access. In our
approach, this translates to the users only being able to invoke methods that have been
authorized to their respective roles.

The remainder of this paper contains five sections. In Section 2, we provide background
on the ADAM environment [5]. In Section 3, we discuss the critical need of consistency
for security, as we seek to guarantee a level of assurance to designers and users
utilizing an URBS/object-oriented approach. In Section 4, we briefly review two of our
previous URBS enforcement approaches, propose and explore software architectural variants
that can offer varying degrees of assurance and consistency, and critique the variants by
comparing and contrasting their capabilities from multiple perspectives. Section 5
examines related work in URBS for object-oriented systems, looking at their support for
consistency and assurance. Section 6 concludes this paper.

2  Background:  The  ADAM  Environment

ADAM [5, 11] is an integrated, language-independent, object-oriented design environment
for applications that have software engineering, database, and security requirements.
ADAM, short for Active Design and Analyses Modeling, can generate compilable code in C++
(GNU C++ and Ontos C++, an object-oriented database system), Ada83, Ada95, and Eiffel
for any object-oriented design. An integral part of this process is the definition of
URBS [9, 10]. Both Unix and PC versions of ADAM are available. As an example, a health
care application (HCA) is employed [9], as shown in Figure 1.


Figure  1:  Sample  Object  Types  for  HCA. 


Figure  2:  Sample  Object,  Attribute,  and  Method  Profiles  for  HCA. 


To track the purpose and intent of different design choices and constructs in an
application, profiles are utilized [5, 9]. Profiles force software engineers to supply
detailed information as an application is designed, and are used to facilitate on-demand
and automatic feedback via analyses to alert designers whenever an action in the
environment results in a conflict or possible inconsistency. The object-type profile for
Prescription, the attribute profile for Cost, and the method profile for UpdateCost are
shown in Figure 2. Other profiles have been omitted for brevity.


Figure  3:  The  URDH  of  the  HCA.


The user-role definition hierarchy (URDH) characterizes the different kinds of
individuals (and groups) who all require different levels of access to an application.
Figure 3 shows a partial URDH in ADAM for HCA. User roles (UR) (e.g., Staff_RN,
Education, etc.) can be grouped under a single user type (UT) (e.g., Nurse). When
multiple UTs share privileges, a user class (UC) can be defined (e.g., Medical_Staff).
To define UCs, UTs, and URs, we utilize a node profile (NP): 1. a name for the node;
2. a prose description of its responsibility; 3. a set of assigned methods (the positive
privileges); 4. a set of prohibited methods (the negative privileges); and 5. a set of
consistency criteria for relating URDH nodes.

A node description for a UT in Figure 3 is: Nurse: Direct involvement with patient care
on a daily basis. In addition, for each UR, the role-security requirements are defined,
e.g., Staff_RN: All clinical information for the patients that they are responsible for.
Can write/modify portions of clinical information to track patient progress. Cannot
change a Physician’s orders on a patient. To establish privileges, application methods
are assigned to URDH nodes. Commonalities (shared methods) can be moved from a set of URs
to their shared UT, and from a set of UTs to their shared UC. Methods shared by all UTs
can be moved up to Users. Thus, commonalities flow up the URDH, while differences flow
down. Prohibited methods are used to explicitly identify which methods cannot be accessed
by a URDH node. Equivalence (subsumption) criteria allow the security engineer to
identify which URs (UTs/UCs) must have the same (subsumable) capabilities, as reflected
in the assigned/prohibited methods. Conflicts between assigned and prohibited methods or
violations of consistency criteria are flagged by ADAM.
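
As a rough illustration of the node profile and of the conflict flagging described above, the following sketch models a URDH node with assigned and prohibited method sets. It is not ADAM's implementation: the class Node, its fields, and the method names are hypothetical, the consistency criteria (item 5 of the profile) are omitted, and letting prohibited methods propagate from parent to child in the same way as assigned methods is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical URDH node (UC, UT, or UR) with a minimal node profile."""
    name: str
    description: str = ""
    assigned: set = field(default_factory=set)     # positive privileges
    prohibited: set = field(default_factory=set)   # negative privileges
    parent: Optional["Node"] = None

    def effective_assigned(self):
        # Shared methods live on the parent, so a node also holds its parent's.
        inherited = self.parent.effective_assigned() if self.parent else set()
        return inherited | self.assigned

    def effective_prohibited(self):
        # Assumption: prohibitions are inherited the same way.
        inherited = self.parent.effective_prohibited() if self.parent else set()
        return inherited | self.prohibited

    def conflicts(self):
        """Methods that are both assigned and prohibited for this node."""
        return self.effective_assigned() & self.effective_prohibited()


nurse = Node("Nurse", "Direct involvement with patient care",
             assigned={"read_clinical_info"})
staff_rn = Node("Staff_RN", assigned={"update_clinical_info"},
                prohibited={"change_physician_order"}, parent=nurse)

print(staff_rn.conflicts())                    # set(): the profile is consistent
staff_rn.assigned.add("change_physician_order")
print(staff_rn.conflicts())                    # {'change_physician_order'}
```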

The final step in ADAM involves user authorization, which is accomplished via a user
profile (UP): 1. a name for the user; 2. a prose description of its responsibility; 3. a
prose description of its security requirements; 4. a set of assigned URs (the positive
privileges); 5. a set of prohibited roles (the negative privileges); and 6. a set of
criteria for relating users. Note that conceptually, a user has privileges via a set of
one or more assigned URs. So, the URDH information is aggregated for each UR for which a
user has been authorized (or prohibited).
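
The aggregation step can be sketched as follows. This is only an illustration under stated assumptions: the ROLES table, its method names, and the functions aggregate and user_conflicts are hypothetical, and the criteria for relating users (item 6 of the profile) are not modelled.

```python
# Hypothetical role definitions: role name -> assigned and prohibited methods.
ROLES = {
    "Staff_RN": {"assigned": {"read_chart", "update_chart"},
                 "prohibited": {"change_order"}},
    "Physician": {"assigned": {"read_chart", "change_order"},
                  "prohibited": set()},
}


def aggregate(role_names, roles):
    """Union the assigned and prohibited methods over a user's assigned roles."""
    assigned, prohibited = set(), set()
    for name in role_names:
        assigned |= roles[name]["assigned"]
        prohibited |= roles[name]["prohibited"]
    return assigned, prohibited


def user_conflicts(assigned_roles, prohibited_roles, roles):
    """Return (roles both assigned and prohibited, methods both granted and denied)."""
    role_clash = set(assigned_roles) & set(prohibited_roles)
    assigned, prohibited = aggregate(assigned_roles, roles)
    return role_clash, assigned & prohibited


print(user_conflicts(["Staff_RN"], ["Physician"], ROLES))
print(user_conflicts(["Staff_RN", "Physician"], [], ROLES))
```

In the second call, the aggregated privileges contain change_order both as an assigned and as a prohibited method, which is the kind of cross-role conflict a security engineer would need to resolve.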
3  The  Need  for  Assurance,  Consistency,  and  Analyses

Role-based security policies and enforcement mechanisms must have high consistency in
order to support a high assurance, secure system. The consistency must be maintained at
all levels within the policy, including individual roles, role hierarchies, and end-user
authorizations, to insure that their creation, modification, and deletion will always
maintain the required URBS policy. Consistency is the foundation upon which high
integrity and assured secure systems must be built. A set of techniques/tools must be
provided that allow URBS policies to be analyzed and assured at all times during design,
development, and maintenance of object-oriented software.

In general, URBS policies are application dependent, and consequently, data security
requirements vary widely from application to application. For example, sensitive health
care data must be protected from unauthorized use while simultaneously being almost
instantaneously available in emergency and life-critical situations. On the other hand,
in some design environments such as CAD, the most up-to-date specifications on mechanical
parts must be available in a shared manner to promote cooperation and facilitate
productivity. In this case, the URBS policies may not protect sensitive personal
information, but may protect information which is equally sensitive from a business
perspective. Further, the strength of URBS policies is that they are intended to be
specific to individuals, determined based on individual needs and special conditions. It
is this idiosyncrasy that makes it so difficult to address the issues of designing
consistent URBS policies.

The  ultimate  responsibility  for  URBS  policies  is  on  the  shoulders  of  the  application’s  management 
personnel  and  organization’s  data  security  officer.  In  order  to  have  these  critical  policy  makers  take  full 
advantage  of  URBS,  tools  and  techniques  must  be  made  available.  Design  techniques,  similar  to  the  ones 
presented  in  Section  2,  are  critical  to  allow  software  and  security  engineers  to  accurately  and  precisely 
specify  their  applications’  functional  and  security  requirements.  To  augment  these  techniques,  a  suite  of 
tools  is  required,  that  can  provide  many  different  and  diverse  analytical  capabilities.  These  tools  should 
automatically  alert  these  engineers  when  potential  conflicts  occur  during  the  creation  or  modification  of 
roles,  role  hierarchies,  and  end-user  authorizations,  thereby  heading  off  possible  inconsistencies.  There 
must  also  be  tools  that  provide  on-demand  analyses,  allowing  engineers  to  gauge  their  realized  software 
and/or  security  requirements  against  their  specifications.  Once  the  URBS  policy  has  stabilized,  the 
tools  should  provide  the  means  to  capture  and  realize  it  via  a  URBS  enforcement  mechanism  that 
is  automatically  generated.  The  overriding  intent  is  to  finish  with  an  object-oriented  system  that 
embodies  a  strong  confidence  with  respect  to  the  URBS  policy  and  its  attainment. 

The remainder of this section explores these and other issues from two perspectives. From
the user-role definition perspective, in Section 3.1, we examine the consistency issues
that must be attainable as roles and dependencies among roles are created and modified.
From an authorization perspective, in Section 3.2, we investigate similar consistency
issues as actual individuals (people) are authorized to play certain roles within an
object-oriented application or system.

3.1  Consistency  for  User  Roles 

When  a  security  engineer  is  creating  and  modifying  user  roles  for  an  object-oriented  application  or 
system,  the  consistency  of  the  definition  is  critical  in  order  to  insure  that  the  URBS  policy  is  maintained. 
This  is  a  time-oriented  issue;  changes  to  the  policy  are  needed,  especially  in  object-oriented  situations, 
where  evolution  and  extensibility  are  the  norm.  Regardless  of  the  changes  that  are  made,  there  must 
be  assurance  that  the  privileges  of  each  user  role  are  adequate  to  satisfy  the  functions  of  the  user  role. 
Moreover,  the  privileges  must  not  exceed  the  required  capabilities  of  the  user  role,  to  insure  that  misuse 
and  corruption  do  not  occur.  In  addition,  since  user  roles  are  often  interdependent  upon  one  another 
(e.g.,  our  approach  uses  a  hierarchy),  it  may  be  necessary  to  examine  their  interactions  to  insure  that 
privileges  aren’t  being  passed  inadvertently  from  role  to  role,  yielding  a  potentially  inconsistent  state. 

There are many different scenarios of evolution that must be handled. A security engineer
may create new roles for a group of potential users or may create specific roles that are
targeted for a particular end-user for a special assignment under a special circumstance.
Each newly created role must be internally consistent so that no conflicts occur within
the role itself. This is also true when a role is modified, which we term intra-role
consistency. For the object-oriented case, when privileges are assigned to each role,
this assignment implicitly grants object-access privileges to the role holder (end-user).
Such an assignment process utilizes the least-privilege principle, which grants only
necessary access privileges but no more. Only those privileges that are relevant to the
user role are permitted. The policy is intentionally very conservative and restrictive,
requiring that the URBS policy be validated by either the software engineer, security
engineer, or both. In some organizations, there are dedicated security officers who
possess the ultimate responsibility with respect to security requirements/policies for
all applications.

To complement the least-privilege principle, user roles often must satisfy mutual
exclusion conditions. Here, there must be a careful balance between permitting access to
certain objects while simultaneously prohibiting access to other, special objects. Mutual
exclusion is a strong URBS concern, dictated by an organization’s rules and regulations,
or by government law. For example, in HCA, an individual assigned the role of Pharmacist
can read the prescription of a patient and update the number of refills after processing
the prescription, but is explicitly prohibited from modifying the dosage or drug of the
prescription. Thus, access and modification to some information is balanced against
exclusion from other information. This strong mutual exclusion situation is clearly
observed by the medical profession and is mandated by law. The URBS policy must ensure
that security requirements are not violated. In our approach, these mutual exclusions are
supported in the URDH by allowing the security engineer to define prohibited methods.
Such a technique is extremely important to insure that the objects that are referenced
within the prohibited privileges of a role do not overlap or contradict with the objects
that are accessible by all of the assigned privileges.

When we consider the interdependence of user roles, such as within our user-role
definition hierarchy (URDH), the internal consistency as captured by least privilege and
mutual exclusion must be expanded to inter-role consistency. In any approach with
interdependence among user roles, there is the potential for user roles to acquire
privileges (both positive and negative privileges) from other roles. This acquisition
process must be clearly understood by the security engineer, particularly in an
environment where URBS policy is constantly changing. In addition, to provide versatile
design tools to the security engineer, it should be possible to establish superior,
inferior, and equivalence relationships among different user roles. These relationships
must also be validated as privileges are defined, acquired, and changed. From the
perspective of the entire URDH, intra-hierarchy consistency must be attained.

To support a URBS definition process with least privilege and mutual exclusion, the
security engineer must be provided with a set of techniques and tools. From a purely
definitional perspective, there must be tools for meaningful comprehension of user roles,
including all positive privileges, negative privileges, and relationships to other roles.
Such a role-profiling tool provides a quick and comprehensive view of the capabilities of
each user role, supporting intra-role consistency. Once any initial definition has
occurred, there must be tools to support analyses for both internal and inter-role
consistency. Even if one has given minimal access to certain objects to a specific user
role, it is the nature of object-oriented applications to be coupled. Hence, even a
minimal set of explicit object accesses might correspond to implied access to a wider set
of objects. Automated analysis tools are necessary for an exhaustive search to follow all
possible object access paths as required by all of the positive and negative privileges
in the security definition. Conflicts discovered during the search will have to be
resolved by the application’s management personnel and security engineer. Feedback must
be available to assist the human designer in arriving at a viable resolution to any
conflicts or inconsistencies.
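
One way to picture such an exhaustive search is as reachability over a graph of method invocations. The sketch below is an illustration only, not a description of ADAM's analyses: it assumes the coupling between objects is available as a call graph, and the graph contents, method names, and the functions reachable and implied_conflicts are hypothetical.

```python
from collections import deque


def reachable(call_graph, start_methods):
    """Breadth-first search: every method reachable from the assigned ones."""
    seen = set(start_methods)
    queue = deque(seen)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def implied_conflicts(call_graph, assigned, prohibited):
    """Prohibited methods that the assigned methods can reach, directly or not."""
    return reachable(call_graph, assigned) & set(prohibited)


# Hypothetical coupling for a Pharmacist role in the HCA example.
CALL_GRAPH = {
    "update_refills": ["validate_prescription"],
    "validate_prescription": ["set_dosage"],
}
assigned = {"read_prescription", "update_refills"}
prohibited = {"set_dosage", "set_drug"}

print(implied_conflicts(CALL_GRAPH, assigned, prohibited))  # {'set_dosage'}
```

Here the role never lists set_dosage explicitly, yet the search reports it because update_refills reaches it through validate_prescription.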

Analyses are available in the ADAM environment for the application’s content/context, and for its
security requirements [5, 9], and are supported via the profiles reviewed in Section 2. Capabilities analyses
in ADAM allow the security engineer to review the permissions (as inferred by the assigned and
prohibited methods) given to a chosen URDH node on an application’s OTs, methods, and/or private
data, thereby supporting the intra-role or internal consistency of the URBS policy. Authorization
analyses in ADAM allow the security engineer to investigate which user roles have what kinds of
access to different aspects of an application (i.e., an OT, a method, or a private data item). Through
these analyses, inter-role and intra-hierarchy consistency can be understood, since the security engineer
can examine the multiple roles that access a component of the application. Overall, the analyses are
intended to provide a crucial first step in the important process of assurance with respect to the
consistency and correctness of the URBS policy.
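
The two kinds of analyses can be read as two lookups over the same policy data: one by role, one by application component. The following is a minimal sketch, assuming a hypothetical dictionary `URBS` of assigned and prohibited methods; it does not reproduce ADAM's actual interface.

```python
# Hypothetical URBS definition: role -> assigned and prohibited methods.
URBS = {
    "Nurse": {
        "assigned": {"Patient.get_vitals"},
        "prohibited": {"Prescription.write"},
    },
    "Physician": {
        "assigned": {"Patient.get_vitals", "Prescription.write"},
        "prohibited": set(),
    },
}

def capabilities(role):
    """Capabilities analysis: what one role may and may not invoke."""
    entry = URBS[role]
    return sorted(entry["assigned"]), sorted(entry["prohibited"])

def authorization(method):
    """Authorization analysis: which roles are assigned or prohibited a method."""
    allowed = sorted(r for r, e in URBS.items() if method in e["assigned"])
    barred = sorted(r for r, e in URBS.items() if method in e["prohibited"])
    return allowed, barred
```

Calling `capabilities("Nurse")` reviews one role in isolation (intra-role consistency), while `authorization("Prescription.write")` lists every role touching one method (inter-role consistency).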

3.2  Consistency  in  End-User  Authorization 

When  considering  consistency  in  end-user  authorization,  the  assumptions  of  the  policy  must  be  clearly 
understood.  For  example,  in  any  organization  where  end-users  can  be  assigned  multiple  roles,  there  are 
two  scenarios  of  permissible  behavior  against  an  application:  1.  End-users  can  only  play  exactly  one  role 
at  any  given  time;  and,  2.  End-users  can  play  multiple  roles  concurrently  at  any  given  time.  The  first 
assumption  does  not  cause  significant  problems,  since  for  an  end-user,  only  one  role  is  active.  As  long 
as  that  role  is  intra-role  and  inter-role  consistent,  there  is  no  problem.  However,  the  first  assumption 
alone does not provide the needed security, but instead raises a number of interesting issues that are
addressed  by  the  second  assumption. 

Namely, when an end-user may play multiple user roles simultaneously at any given time within
an application and within the organization, a level of end-user consistency is introduced. Similar in
concept  to  inter-role  consistency,  in  end-user  consistency,  the  privileges  of  the  multiple  roles  for  a  single 
end-user  are  aggregated.  Such  an  aggregation  may  introduce  conflicts  between  positive  and  negative 
privileges  that  span  multiple  roles.  Further,  when  a  new  privilege  is  assigned  to  an  established  user 
role,  with  internal  and  inter-role  consistency  assured,  it  may  still  impact  the  end-user  consistency.  Also, 
when  a  new  role  is  assigned  to  an  end-user,  it  too  may  conflict  with  existing  concurrent  roles  for  that 
user.  While  all  of  these  different  problems  are  the  responsibility  of  the  security  engineer,  given  the 
dynamics  and  complexity  of  a  real-world  organization,  it  would  be  very  difficult  for  that  person  to 
accomplish  this  task  without  appropriate  and  effective  tools. 
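
To make the aggregation concrete, the sketch below collects the positive and negative privileges of all roles held concurrently by one end-user and reports the methods that appear on both sides. The `Role` class, the role names, and the method names are illustrative assumptions, not definitions taken from our model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Role:
    name: str
    assigned: frozenset = frozenset()    # positive privileges
    prohibited: frozenset = frozenset()  # negative privileges

def end_user_conflicts(roles):
    """Map each conflicting method to (granting roles, prohibiting roles)."""
    granted, denied = {}, {}
    for role in roles:
        for method in role.assigned:
            granted.setdefault(method, set()).add(role.name)
        for method in role.prohibited:
            denied.setdefault(method, set()).add(role.name)
    return {m: (granted[m], denied[m]) for m in granted.keys() & denied.keys()}

nurse = Role("Nurse",
             assigned=frozenset({"Patient.get_vitals"}),
             prohibited=frozenset({"Prescription.write"}))
physician = Role("Physician",
                 assigned=frozenset({"Prescription.write"}))

print(end_user_conflicts([nurse, physician]))
```

Each role is consistent on its own, yet an end-user holding both at once is granted and prohibited Prescription.write, which is exactly the kind of conflict that must be surfaced to the security engineer.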

Automated  tools  are  needed  for  the  user  authorization  model  in  a  secure  data  system  so  that  no 
URBS policy violations are possible for any end-user in the organization. Thus, the techniques/tools
in  Section  3.1  must  be  extended  to  consider  end-user  consistency,  allowing  the  security  engineer  to 
focus  on  the  conflicts  of  privileges  for  single  end-users  with  multiple  concurrent  roles.  Tools  from  both 
definitional  and  analytical  perspectives  are  required.  ADAM  provides  such  tools  as  extensions  of  the 
URDH  case  described  in  Section  3.1,  since  they  (in  most  cases)  repeatedly  call  the  “relevant”  URDH 
analyses  for  each  user  role  assigned  to  an  individual. 

Once intra-role, inter-role, intra-hierarchy, and end-user consistency have been attained at a
definitional  level,  there  are  two  remaining  requirements: 

1.  the  defined  URBS  policy  must  be  captured  within  the  object-oriented  application;  and 

2.  once  captured,  at  both  compile  time  and  runtime,  the  policy  must  be  enforced. 

For  both  requirements,  our  previous  work  on  URBS  enforcement  approaches  [2,  3,  4]  is  intended  to 
support,  in  part,  the  consistency  and  assurance  of  the  URBS  policy.  However,  as  we  will  see  in 
Section  4,  through  software  architectures  we  can  provide  a  higher  level  of  assurance  regarding  the 
guarantee  that  must  be  met  concerning  a  defined  URBS  policy  for  an  object-oriented  application. 
Our  approach  will  involve  the  encapsulation  of  the  URBS  policy  and  enforcement  mechanism  into  a 
software  component  that  is  part  of  a  larger  software  architecture  for  the  support  of  an  object-oriented 
application  with  embedded  security.  It  is  our  hope  that  such  a  software  component  can  be  effectively 
managed,  controlled,  and  most  importantly  validated. 

4  Software  Architectures  and  URBS  Mechanisms 

To  understand  our  efforts  in  this  section,  it  is  critical  that  we  define  our  assumptions  concerning 
the  composition  of  object-oriented  software.  Basically,  the  crux  of  an  object-oriented  system  is  an 
underlying, shared object type/class library to represent the kernel or core functionality, e.g., in HCA,
the  OTs  given  in  Figure  1.  Once  such  a  library  has  been  developed,  other  software  engineers  will  design 
and  develop  tools  against  it,  e.g.,  in  HCA,  a  patient  user  interface,  an  admissions  subsystem,  a  software 
component  to  allow  a  blood  analysis  system  to  send  test  results  directly  into  patient  records,  etc.  Thus, 
in our approach, end-users are not able to write programs to access data directly, which is instead the
[...] discipline whose intent is to force software engineers to step back from a traditional
algorithm/data structure perspective and view software as a collection of components that interact
locally (within each component) and globally (between components). [...] interactions, the key
consideration is to identify the communication and [...] functionality of the system to be precisely
captured. [...] definition process, software architectures permits database needs, performance [...]
requirements, to be considered. These considerations are critical as large-scale object-oriented
software becomes more dominant in industry.

[...] to present and critique multiple software architectural variants for URBS enforcement [...]
attainment of consistency [...]. While previous efforts focused on detailed URBS enforcement
approaches [2, 4], our intent in this section is to step back from this work and consider the ways that
these approaches can fit [...] enforcement for object-oriented software. Nevertheless, we start this
section by reviewing two of our previous approaches, since they set the context for our subsequent
discussion related to software architectures. Then, we focus on two architectural styles [...]. For both
styles, multiple variants are presented and analyzed. [...] comparing and contrasting their capabilities
[...] support consistency/assurance, the impact of evolving both the security [...]


4.1  UCLA  and  GEA  Approaches

The UCLA and GEA approaches were previously presented [2, 4], and are briefly summarized in this
section. [...] inheritance to implement the enforcement mechanism [...] hierarchy for the URDH. For
each URDH node, positive method access is [...]. As the application executes, method invocations must
validate against the current UR by checking the method call against the URDH class library. At runtime,
a [...] invocation of the appropriate methods that are used to verify whether the user’s role has the
[...] permissions. From an evolvability perspective, as URs are added, only the URDH class library
[...]. Similarly, if a UR’s assigned/prohibited methods change, only the URDH class [...].

[...] (GEA) incorporates concepts of reusable template classes to realize [...] the URBS policy. In
GEA, when a method is invoked, the UR of the current user is checked to verify if access can be granted.
If not, an exception is raised and processed. When the GEA security template is utilized by a class that
[...] ensures that a check method is called when a user attempts to invoke any method on the active
instance. If the user’s role doesn’t permit access, an exception is thrown, and the invoking method will
not allow its functionality to be executed and affect instances. There are many advantages to GEA.
First, the code in the GEA security template is hidden from the software engineer. Software reuse is
promoted [...] been established, the remaining code works based on that initialization. While the code
for OT/class methods must be changed, the changes are hidden from the software engineer writing tools.
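
The check-then-raise behavior described for GEA can be sketched as follows. This is not the GEA template itself, which is defined in [2, 4]; it is a loose Python analogue in which a hypothetical class decorator `secured` plays the part of the security template, `POLICY` is an assumed role-to-method table, and the role is passed explicitly as the first argument for simplicity.

```python
import functools

class AccessDenied(Exception):
    """Raised when the current user role may not invoke a method."""

# Hypothetical policy: role name -> permitted "Class.method" names.
POLICY = {
    "Nurse": {"Patient.get_vitals"},
    "Physician": {"Patient.get_vitals", "Patient.set_diagnosis"},
}

def _checked(class_name, method_name, method):
    """Wrap one method so the role is verified before the body runs."""
    @functools.wraps(method)
    def wrapper(self, role, *args, **kwargs):
        if f"{class_name}.{method_name}" not in POLICY.get(role, set()):
            raise AccessDenied(f"{role} may not call {class_name}.{method_name}")
        return method(self, *args, **kwargs)
    return wrapper

def secured(cls):
    """Stand-in for a security template: guard every public method of cls."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, _checked(cls.__name__, name, attr))
    return cls

@secured
class Patient:
    def __init__(self):
        self._diagnosis = None

    def get_vitals(self):
        return "stable"

    def set_diagnosis(self, text):
        self._diagnosis = text
        return text
```

With this in place, `Patient().get_vitals("Nurse")` returns normally, while `Patient().set_diagnosis("Nurse", "flu")` raises `AccessDenied` before the method body can affect the instance. The class author writes no checking code, which mirrors the point that the security code is hidden from the software engineer.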

4.2  Architectural  Alternatives 

From a software architecture perspective, the URBS enforcement mechanism can be located in many
different places and function in many different ways. It can be an integral part of the OT/class library, to
be automatically included when any tool utilizes a portion of the library. Alternatively, it may be an
independent  and  self-contained  library  that  is  compiled  with  each  application  tool,  similar  in  concept  to 
a  math  library  being  included.  Other  choices  could  have  a  separately  executing  process  through  which 
all  security  requests  must  be  handled.  Regardless  of  the  choice,  the  key  underlying  characteristics  must 
be  the  attainment  of  high  consistency  and  assurance.  This  must  be  balanced  against  concerns  that 
seek  to  minimize  the  amount  of  knowledge  a  software  engineer  must  have  on  URBS  and  to  yield  an 
approach  that  is  evolvable,  since  object-oriented  software  (and  its  security  policy)  must  be  conducive 
to  change. 

To  standardize  terminology  regarding  the  assumptions  on  object-oriented  systems  given  in  the 
introduction  of  Section  4,  we  define: 

AppCL:  Represents  the  shared,  object-oriented  class  library  for  an  application,  e.g.,  the  HCA  classes  as 
shown  in  Figure  1. 

SCL:  Represents  the  security  class  library  for  an  application  that  embodies  URBS  definition  and
enforcement,  e.g.,  the  classes  generated  from  the  URBS  definition  as  given  in  Figure  3.

TCL:  Represents  the  tool  class  library  for  individual  tools  (e.g.,  a  patient  GUI,  an  admissions  subsystem, 
etc.)  against  the  application. 

Note  that  when  the  L  is  dropped  from  either  AppCL  or  SCL,  we  are  referring  to  an  individual  class 
of  the  library.  Using  these  basic  assumptions,  the  remainder  of  this  section  explores  layered  systems, 
and  communicating  processes  and  the  client/server  paradigm,  as  software  architectural  alternatives 
for  URBS  enforcement.  For  each  alternative,  multiple  variants  are  presented  and  discussed,  and  then 
analyzed  with  respect  to:  the  level  of  consistency  and  assurance  that  each  variant  provides  for  security 
concerned  users;  the  dimensions  of  evolvability,  which  is  critical  since  both  the  URBS  policy  and  object- 
oriented  software  tend  to  be  dynamic  over  time;  and,  the  impact  of  the  absence/presence  of  a  persistent 
store  (i.e.,  database). 

4.2.1  Layered  Systems

Layered systems are a classic technique for software architectures, where layers of functionality are
built upon one another to provide a controlled environment for access to information. There are two
layered system variants for URBS enforcement: LS1, an application-based approach, and LS2, a
class-based approach. In both variants, security is at the level of the method invocation, which is processed
by the SCL prior to its actual runtime call against an instance of a class in the AppCL. In either case, the
SCL can be either the UCLA or GEA approach. In the LS2 variant, each individual class handles the
method invocations that apply to its instances as they are received by the various tools. The difference
is one of granularity. In LS1, security is managed at the application level overall, and once it has been
determined that the tool can invoke a method, it is passed through to the involved instance or instances.
In LS2, security is managed at the instance level only. This may cause a problem when instances refer
to other instances, i.e., a security request by the tool involves multiple instances of either the same or
different classes.

[Figure: Variant LS2: Class-Based Approach. Each TCL connects to individual SC/AppC pairs.]




From a consistency/assurance perspective, it appears that LS1 has the advantage, since all of the
method invocations must pass forward through the security layer for authentication and all results
must pass back for enforcement. That is, when utilizing a tool and its various options, users end up
calling various methods based on their UR and under the control of the tool. However, variant LS2’s
view of allowing each instance to maintain its own security is superior to LS1 from a software evolution
perspective, since changes to the security policy of one class may not affect the policies of other classes.
When a persistent store is included into the mix, LS1 has the edge, since all accesses must proceed via
a common security layer. In LS2, there are potential concurrent access issues if some or all AppCs are
directly connected to a database.
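
The difference in granularity between the two layered variants can be illustrated with a small sketch. The classes `ApplicationLayer` (for LS1) and `SecuredClass` (for LS2), together with the Patient and Prescription classes and their policies, are hypothetical names introduced only for this illustration.

```python
class AccessDenied(Exception):
    """Raised when a role may not invoke a method."""

class Patient:                      # hypothetical AppC
    def name(self):
        return "J. Doe"

class Prescription:                 # hypothetical AppC
    def drug(self):
        return "aspirin"

class ApplicationLayer:
    """LS1: one security layer in front of the whole AppCL."""
    def __init__(self, policy):
        self.policy = policy        # role -> set of "Class.method" names

    def invoke(self, role, instance, method):
        key = f"{type(instance).__name__}.{method}"
        if key not in self.policy.get(role, set()):
            raise AccessDenied(key)
        return getattr(instance, method)()

class SecuredClass:
    """LS2: one SC wrapped around a single AppC instance."""
    def __init__(self, instance, allowed):
        self.instance = instance
        self.allowed = allowed      # role -> method names of this class only

    def invoke(self, role, method):
        if method not in self.allowed.get(role, set()):
            raise AccessDenied(f"{type(self.instance).__name__}.{method}")
        return getattr(self.instance, method)()

ls1 = ApplicationLayer({"Nurse": {"Patient.name"}})
ls2_patient = SecuredClass(Patient(), {"Nurse": {"name"}})
ls2_prescription = SecuredClass(Prescription(), {})
```

In the LS1 sketch the whole policy lives in one table, so every invocation is checked in one place. In the LS2 sketch each wrapper carries only the policy for its own class, so changing the rules for Prescription leaves the Patient wrapper untouched.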


4.2.2  Communicating  Processes  -  C/S  Paradigm

In  the  communicating  processes  approach  to  URBS  enforcement,  a  process-oriented,  client/server  (C/S) 
paradigm  is  adopted.  TCL,  SCL,  and  AppCL  are  integrated  into  single  and/or  multiple  processes,  resulting 
in  a  total  of  four  different  variants:

Variant CP1: Base case with a single process that combines TCL, SCL, and AppCL.

Variant CP2: Multi-process with TCL and SCL combined into a client that is served by AppCL.

Variant CP3: Multi-process with TCL a client of a combined SCL and AppCL server.

Variant CP4: TCL, SCL, and AppCL are independent processes that interact in a two level C/S
architecture.

AppCL  represents  the  minimal  subset  needed  by  the 
tool/TCL  to  support  its  functionality  and  enforce  its  security  policy. 

Variant CP1 is similar to LS1, but each tool is compiled as a separate, standalone process. In this
case SCL and AppCL are analogous to a math library that is compiled when needed by the software.
Functionally, within each process, TCL sends method invocations to SCL which in turn passes them
through to AppCL according to the URBS policy. Results are passed back from AppCL to SCL, which
may then filter the response before passing them back to TCL. Note that SCL and AppCL in each process
represent those subsets of the class libraries needed by each tool, and that either UCLA or GEA can
be the enforcement mechanism realized within SCL.

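Variant CP1: Single Process - Base Case - no C/S

+------+-----+-------+   +------+-----+-------+         +------+-----+-------+
| TCL1 | SCL | AppCL |   | TCL2 | SCL | AppCL |   ...   | TCLn | SCL | AppCL |
+------+-----+-------+   +------+-----+-------+         +------+-----+-------+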


From a consistency/assurance perspective, it would be a requirement that each tool be compiled
into a single process with SCL and AppCL included. Thus, the level of assurance and consistency that
is attained is tied to the accuracy and completeness of the URBS policy. But, note that since each
tool may have only a portion of the overall URBS policy, consistency becomes a prominent concern
whenever changes need to be made, i.e., updates must be made to all tools that use the portion of the
policy that changed. Extensibility in CP1 presents major problems. While it is easy to add new tools
and new tools when added won’t affect existing tools, changes to either the SCL or AppCL definitely
will. If the changes are localizable to data files that can be dynamically loaded,
then URBS policy changes should be supportable. But, if the changes require the SCL to be rebuilt,
then unless the compilation/runtime environment supports dynamically linkable class libraries, evolution
will also require the recompilation of all affected tools. More significantly, changes to AppCL have a
dramatic impact for all affected tools and all affected portions of the SCL. In addition, since the AppCL
is compiled with each tool, it is unclear whether this approach can successfully work when AppCL is
linked to a database.

Variants CP2 and CP3 are both multi-process approaches with clearly defined client/server separation
of functionality. In CP2, each client is a TCL/SCL pair that interacts with a shared AppCL server. In this
case, each SCL represents that subset of the overall URBS policy/enforcement that is needed by the
specific tool, i.e., if a tool only uses one or two classes, the SCL is that subset of the overall URBS policy
for those needed classes. Thus, the URBS policy/enforcement is specifically bound to each tool. Like
CP1, the level of consistency/assurance that is attained depends on the realization of the URBS policy
within SCL. The fact that the policy is spread across multiple tools does introduce potential consistency
concerns when changes to the policy are made. Changes to the URBS policy impact SCL in the same
way as CP1. However, there are improvements in changes to AppCL; since it is in a separate process,
careful planning will allow some changes to have no impact on the joint TCL/SCL clients. Drastic
changes to AppCL (e.g., deletion of classes, additions of classes, major functionality upgrades) are likely
to impact SCL, thereby requiring the recompilation of tools. As conceptualized, both UCLA and GEA
are tightly linked to AppCL, making them inappropriate for CP2. From a database perspective, the
presence of a persistent store within or coupled to AppCL should be supportable and invisible to the
clients.


[Figure: Variant CP2: Multi-Process/Shared AppCL. Each TCL/SCL client connects to the shared AppCL.]

[Figure: Variant CP3: Multi-Process/Shared SCL/AppCL.]

In CP3, the client is each individual tool (TCL), with the server containing the joint SCL/AppCL
functionality. By decoupling the URBS policy/enforcement from each tool, the tool becomes relatively
independent from changes to the security policy. Each tool simply makes requests to the joint server and
the way that those requests are satisfied can be hidden using typical object-oriented design approaches.
Thus, unlike CP1 and CP2, changes to the URBS policy shouldn’t impact tool code. The placement of
the entire URBS policy/enforcement in one location greatly improves consistency and assurance, since
all changes to the policy occur in one place. This is superior to both the CP1 and CP2 variants. Like
CP1, SCL can be realized with UCLA or GEA.

Changes to the URBS policy and/or the AppCL may require that the joint server be periodically
rebuilt, i.e., changes to AppCL may still impact SCL. As long as those changes don’t alter the signatures
of the various methods/protocols that the tools utilize, there should be no impact on the tool code.
Basically, the dimension of evolvability allows the easy addition of new tools or new users utilizing
existing tools. Database integration of AppCL is the same as CP2. However, from a performance
perspective, since all security requests are processed by a joint server, there is the potential that this
server will become a bottleneck as the throughput of the system increases, i.e., with more tools, or
more users utilizing existing tools.
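
The independence of tool code from policy changes in CP3 can be sketched as follows. The example is an in-process stand-in: `JointServer` is a hypothetical class that represents the server process, no real inter-process communication is performed, and the request name `list_patients` is assumed for illustration.

```python
class JointServer:
    """Stands in for the CP3 server process holding both SCL and AppCL."""
    def __init__(self, policy, app_objects):
        self._policy = policy       # SCL part: role -> set of request names
        self._app = app_objects     # AppCL part: request name -> callable

    def update_policy(self, policy):
        """Policy changes happen here, in one place."""
        self._policy = policy

    def request(self, role, name):
        if name not in self._policy.get(role, set()):
            return {"ok": False, "error": "access denied"}
        return {"ok": True, "result": self._app[name]()}

def admissions_tool(server, role):
    """A TCL client: it knows the request name, not the policy."""
    reply = server.request(role, "list_patients")
    return reply["result"] if reply["ok"] else []

server = JointServer(
    {"Clerk": {"list_patients"}},
    {"list_patients": lambda: ["J. Doe"]},
)
```

If `server.update_policy({})` is called, the same unchanged `admissions_tool` simply receives a denial. Because every request funnels through the single `request` method, the sketch also makes the potential bottleneck visible.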

Variant CP4 is presented as a means to alleviate the remaining consistency, assurance, and performance
concerns of CP3. Variant CP4 is truly a multi-process, multi-leveled, client/server architecture.
In  this  case,  each  TCL  is  a  client  to  an  SCL  server  that  provides  security  for  the  entire  AppCL,  i.e.,  the 
SCLi’s  are  replicated.  Each  SCL,  in  turn,  is  a  client  to  the  shared  AppCL.  Like  CP2,  SCLi’s  separation 
from  AppCL  negates  UCLA  and  GEA  as  appropriate  solutions. 

The relationship between each TCLi.j and its respective SCLi acquires the advantages of CP3 with
respect to: the independence of the tool code from SCL (and AppCL); the ability to add new tools; and,
the lack of impact of changes to SCL (and AppCL) on the tool code. The multiple SCL servers to the
TCL clients also alleviate a level of performance concerns from CP3, allowing more SCLs to be added as
more tools (and hence, more users) need to be served. Consistency and assurance in CP4 maintain the
benefits of CP3 over the other two variants: each SCL has the entire URBS policy/enforcement, so any
changes to the policy can be made and replicated. CP4 still may have performance bottlenecks with
respect to access to AppCL. But those bottlenecks have now been delineated from the SCLs, and can be
handled by replacing AppCL by a distributed object-oriented class library with database support.
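
A minimal sketch of the replicated SCL arrangement follows, again simulated within one process. The classes `SharedAppCL` and `SCLServer` and the function `replicate` are hypothetical, and the way a real deployment would distribute policy updates is not specified here.

```python
class SharedAppCL:
    """Stands in for the single shared application class library."""
    def __init__(self):
        self.calls = 0

    def list_patients(self):
        self.calls += 1
        return ["J. Doe"]

class SCLServer:
    """One SCL replica: a server to tools and a client of the shared AppCL."""
    def __init__(self, app, policy):
        self.app = app
        self.policy = dict(policy)  # each replica holds the entire policy

    def request(self, role, name):
        if name not in self.policy.get(role, set()):
            raise PermissionError(f"{role} may not call {name}")
        return getattr(self.app, name)()

def replicate(policy, servers):
    """Push one policy definition to every SCL replica."""
    for server in servers:
        server.policy = dict(policy)

app = SharedAppCL()
replicas = [SCLServer(app, {}) for _ in range(3)]
replicate({"Clerk": {"list_patients"}}, replicas)
```

Adding capacity means appending another `SCLServer` and calling `replicate` again. All replicas still forward to the one `app` object, which is where the remaining bottleneck would sit.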


[Figure: Variant CP4: Multi-Process - C/S - Replicated SCL/Shared AppCL.]

4.3  Critiquing  the  Architectural  Variants

This section summarizes the evaluative statements for the six variants into a cohesive discussion
that clearly compares and contrasts their capabilities. Our first critique is based on the location and
structure of the URBS policy/enforcement within each variant. This is important from a consistency
and assurance perspective. In LS1, CP3, and CP4, the entire policy/enforcement is present and captured
within SCL (replicated in CP4). In LS2, CP1, and CP2, the policy is partially captured, to the level
required by the tool/TCL. From a consistency perspective, whenever the URBS policy changes, there
must be assurance that the policy is still enforced by all existing tools. The centralized nature of LS1,
CP3, and CP4 lends itself to maintaining that assurance after the change. In the case of LS2, CP1,
and CP2, the tools/TCLs must be recompiled to ensure that all SCLs are updated. Also, since the policy
is spread across multiple SC/AppC pairs (in LS2) or is unique to each process (in CP1 and CP2), there is
a chance that inconsistencies can arise that impact assurance, if all recompilations are not carefully
performed.

Our second critique involves the impact of changes on each variant when either the security policy
or application classes are changed. For LS1, LS2, CP3, and CP4, as long as accepted object-oriented
design techniques (abstraction, representation independence, etc.) have been followed, it should only be
necessary to recompile SCLs and/or AppCLs; there should be no impact on tools/TCL. In fact, depending
on the actual enforcement approach (UCLA, GEA, or other), two situations might occur: when the
security policy changes, SCL or SCL/AppCL may need recompilation; and, when some application classes
change, AppCL or AppCL/SCL may need recompilation. Both situations are dependent on the
interrelation of the enforcement approach to the application classes. For other variants: when the policy
changes, CP1 and CP2 must be rebuilt, since SCL is within the same process/client as the tool/TCL;
when some application classes change, each tool/TCL in CP1 that uses the subset that has changed must
be recompiled. CP2 behaves in a similar fashion to CP3 and CP4 for changes to the AppCL.

A third critique involves the utility of our existing enforcement mechanism approaches (UCLA and
GEA) for the architectural variants. As currently designed, both UCLA and GEA are tightly coupled
to AppCL. That is, it would be difficult to cleanly and completely separate out the SCL from the AppCL.
This being the case, it is apparent that some variants are more conducive to the two approaches than
others. Namely, LS1, LS2, CP1, and CP3 can all function with either UCLA or GEA as SCL, since
SCL is linked to AppCL. On the other hand, neither CP2 nor CP4 can support UCLA and GEA for the
AppCL, without changes to UCLA and GEA that decisively separate the security policy/enforcement
from the application class library. It will be necessary to either rework UCLA and GEA, or design new
enforcement variants to support CP2 and CP4.

Our final critique focuses on the case when database interactions are required from the AppCL to
a persistent store. LS1, CP2, CP3, and CP4 all separate AppCL from the tools/TCL, meaning that a
persistent store can be easily supported. LS2 and CP1 have problems, since each approach utilizes a
partial AppCL, for only those classes that are needed by each tool/TCL. Thus, for LS2 and CP1, if database
access were to occur, it would likely require that the tools interact to synchronize their requests, which
raises many major roadblocks. From a performance perspective, of the variants that can easily support
a database, all but CP4 have potential bottlenecks at either the SCL, AppCL, or both. CP4 offers the
best solution, and if needed, the AppCL can be expanded to a distributed object-oriented database to
satisfy increases in either tools or users.

5  Pertinent  Research  Works  in  This  Area 

One recent effort in security for distributed objects raises the issue of high assurance as a key
requirement [7]. Interestingly, their approach utilizes a layered architecture as a means to realize, in part,
security for interacting, heterogeneous objects. The layered architecture contains: an underlying
theoretical model (layer 1), a meta-object model (layer 2) as a primitive object architecture that is used by
an abstract object model (layer 3) to construct distributed object-oriented applications that execute
under the control of an object-oriented OS environment (layer 4). High assurance in their approach is
predicated on the innermost theoretical layer that provides the formal methods that both capture and
enforce the security policy. While their approach is object-oriented, it is not clear whether mandatory
access control and/or URBS will be supported.

There has also been a significant body of research related to our own efforts in role-based security
for object-oriented systems. While none of these efforts specifically address consistency and assurance,
there are inferences that can be drawn regarding their potential to support these two important
concepts. In [13, 14], a model of authorization for next-generation database systems is proposed. In this
model, an authorization is defined as a 3-tuple (s, o, a), where s belongs to the set of subjects, o belongs
to the set of authorization objects in a system, and a belongs to the set of authorization types. The roles
are defined to reduce the number of authorization subjects and these roles form a role lattice, which is
similar to the URDH in our approach. The authorization can be either implicit/explicit, strong/weak,
or positive/negative, the latter of which is similar to assigned/prohibited methods. Collectively, their
different combinations of an authorization are strongly related to consistency/assurance, since they
imply a certain, specific behavior of their enforcement process. For example, implicit authorization
is utilized to infer additional authorization requirements from a stored set of base privileges. In their
approach, the allowable combinations of the different authorizations are critical for a security engineer
to successfully define his/her policy.
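
The idea of inferring implicit authorizations from a stored base of (s, o, a) tuples can be illustrated loosely as follows. This sketch is not the model of [13, 14]: the role lattice, the rule that a role inherits the positive authorizations of the roles below it, and the rule that an explicit negative authorization on the role itself blocks access are simplifying assumptions made only for this example, and the strong/weak distinction is omitted.

```python
from typing import NamedTuple

class Authorization(NamedTuple):
    subject: str            # s: here, a role
    obj: str                # o: an authorization object
    auth_type: str          # a: an authorization type
    positive: bool = True   # False marks a negative authorization

# Hypothetical role lattice: role -> roles directly below it.
JUNIOR_ROLES = {"Director": ["Manager"], "Manager": ["Clerk"], "Clerk": []}

def roles_at_or_below(role, lattice=JUNIOR_ROLES):
    """Collect the role and every role beneath it in the lattice."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(lattice.get(current, []))
    return seen

def is_authorized(role, obj, auth_type, base):
    """Explicit denial wins; otherwise look for a positive tuple at or below."""
    if Authorization(role, obj, auth_type, positive=False) in base:
        return False
    return any(
        Authorization(r, obj, auth_type, True) in base
        for r in roles_at_or_below(role)
    )

base = {Authorization("Clerk", "Invoice", "read")}
```

With only the Clerk tuple stored, `is_authorized("Manager", "Invoice", "read", base)` holds implicitly, so the base set stays small while the derived set grows with the lattice.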

In object-oriented software, the presence of inheritance introduces a potential set of problems that
can significantly impact security [17]. For example, a user may have access authorized on the
subclass B, but have access denied on the superclass A. The possible problems include a potential
information flow from a higher security level to a lower level by inheriting the data item and a potential
write-down of information by using the methods defined in a higher level to update the data item in the
lower level. The identification and resolution of inheritance problems are critical to promote consistency
and facilitate assurance of the URBS policy/enforcement mechanism. Assigned/prohibited methods,
the URDH, and analyses can be used by a security engineer to avoid inheritance-related problems.
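
The subclass/superclass example can be detected mechanically by walking the inheritance chain. The sketch below uses two placeholder classes A and B as in the text; the function `inheritance_conflicts` is a hypothetical helper, and it only flags the situation without deciding how it should be resolved.

```python
class A:
    """Superclass on which access is denied."""

class B(A):
    """Subclass on which access is authorized."""

def inheritance_conflicts(granted, denied):
    """Return (subclass, superclass) pairs: granted below, denied above."""
    denied = set(denied)
    return [
        (sub, sup)
        for sub in granted
        for sup in sub.__mro__[1:]   # proper ancestors of sub
        if sup in denied
    ]

print(inheritance_conflicts([B], [A]))
```

The single pair reported, (B, A), corresponds to a user who could reach data or methods inherited from A through an authorized B.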

Another  effort  has  examined  the  issues  and  considerations  that  are  required  to  integrate  a  collection 
of  existing,  locally  defined  user  views  into  a  single  shared  view  by  utilizing  an  object-oriented  approach 
[6].  When  different  views  are  merged,  there  is  the  requirement  to  identify  problems  and  inconsistencies, 
in  order  to  resolve  any  conflicts  that  might  exist  among  the  different  views.  The  merging  of  views  to 
resolve  conflicts  is  a  key  step  towards  a  consistent  and  assured  URBS  policy.  In  this  sense,  their 
work  is  similar  to  our  analysis  techniques.  However,  their  work  is  geared  towards  retrofitting  a  shared 
schema  on  top  of  existing,  different  views,  in  a  traditional  database  design  framework,  while  ours  is 
based on a shared schema and the allocation of privileges from this common basis for general purpose
object-oriented applications.

Other related work similar to our own efforts has been in the object-oriented programming area.
There  has  been  work  on  aspects  as  a  mechanism  for  object-oriented  data  models  to  extend  a  given  class 
with  new  capabilities,  including  roles  [15].  Another  effort  allows  different  classes  in  an  application  to 
have different subjective views [8]. An extension of this effort has focused on composing these subjective
views, with implementation support in C++ [12]. None of these efforts approach the problem from the
user role-based security and/or database perspectives. Thus, it is unclear what level of consistency and
assurance  can  be  supported. 


6  Concluding  Remarks  and  Future  Work 

Consistency  and  assurance  for  object-oriented  systems  is  critical,  since  it  is  their  nature  to  evolve  and 
change  over  time.  When  both  the  application  class  library  and  the  URBS  policy  are  dynamic,  those 
changes  have  the  potential  to  significantly  impact  on  the  application’s  tools,  which  in  turn,  impacts  on 
actual users. The emerging discipline of software architectures can be utilised to examine alternative
placements  for  the  tools,  URBS  policy,  and  application  class  library.  In  Sections  4.2  and  4.3,  we  pre¬ 
sented  and  critiqued  six  different  software  architectural  variants,  based  on  layered  systems  (2  variants) 
and  communicating  processes  (4  variants).  Based  on  our  analyses  it  appears  that  three  approaches 

[...] policy/enforcement and application
class library that is utilized by multiple application tools; CP3, a client/server solution where each tool
[...] consists of a joint process containing the URBS policy/enforcement and
[...] a client/server solution where each tool is a client, the URBS
[...] application class library has its own independent
[...] difference between the three occurs when a database interacts with the application
class library, causing a performance bottleneck. In such a situation, CP4 lends itself to most [...]
evolving from a centralized to a distributed object-oriented database.

[...] of efforts involve the attempt to exploit software architectures as part of the generation
of a URBS enforcement mechanism by ADAM. Specifically, we have been transitioning the UCLA and
C++ [...] to Ada95. Once in Ada95, we can then exploit its
[...] the approaches presented in Section 4.2, with our likely initial
focus on [...] of a multi-processed object-oriented application with
URBS [...] of consistency and assurance that can be attained when
changes to the URBS policy and/or application class library occur.
Abstract  In role-based access control (RBAC) permissions
are associated with roles, and users are made
members  of  appropriate  roles  thereby  acquiring  the 
roles’  permissions.  The  principal  motivation  behind 
RBAC  is  to  simplify  administration.  An  appealing 
possibility  is  to  use  RBAC  itself  to  manage  RBAC, 
to  further  provide  administrative  convenience.  In  this 
paper  we  investigate  one  aspect  of  RBAC  administra¬ 
tion  concerning  assignment  of  users  to  roles.  We  de¬ 
fine  a  role-based  administrative  model,  called  URA97 
(user-role  assignment  ’97),  for  this  purpose  and  de¬ 
scribe  its  implementation  in  the  Oracle  database  man¬ 
agement  system.  Although  our  model  is  quite  differ¬ 
ent  from  that  built  into  Oracle,  we  demonstrate  how 
to  use  Oracle  stored  procedures  to  implement  it. 


1  INTRODUCTION 

Role-based access control (RBAC) has recently received
considerable attention as a promising alternative
to traditional discretionary and mandatory access
controls (see, for example, [FK92, FCK95, Gui95,
GI96, MD94, HDT95, NO95, SCFY96, vSvdM94,
YCS97]). In RBAC permissions are associated with
roles,  and  users  are  made  members  of  appropriate  roles 
thereby  acquiring  the  roles’  permissions.  This  greatly 
simplifies  management  of  permissions.  Roles  are  cre¬ 
ated  for  the  various  job  functions  in  an  organization 
and  users  are  assigned  roles  based  on  their  responsibil¬ 
ities  and  qualifications.  Users  can  be  easily  reassigned 
from  one  role  to  another.  Roles  can  be  granted  new 
permissions  as  new  applications  and  systems  are  incor¬ 
porated,  and  permissions  can  be  revoked  from  roles  as 
needed.  Role-role  relationships  can  be  established  to 
lay  out  broad  policy  objectives. 

In large enterprise-wide systems the number of roles
can be in the hundreds or thousands, and users can
be in the tens or hundreds of thousands, maybe even
millions. Managing these roles and users, and their
interrelationships  is  a  formidable  task  that  often  is 
highly  centralized  and  delegated  to  a  small  team  of 
security  administrators.  Because  the  main  advantage 
of  RBAC  is  to  facilitate  administration  of  permissions, 
it is natural to ask how RBAC itself can be used to
manage  RBAC.  We  believe  the  use  of  RBAC  for  man¬ 
aging  RBAC  will  be  an  important  factor  in  the  long¬ 
term  success  of  RBAC.  Decentralizing  the  details  of 
RBAC administration without losing central control
over  broad  policy  is  a  challenging  goal  for  system  de¬ 
signers  and  architects. 

As  we  will  see  there  are  many  components  to 
RBAC.  RBAC  administration  is  therefore  multi¬ 
faceted.  In  particular  we  can  separate  the  issues  of 
assigning  users  to  roles,  assigning  permissions  to  roles, 
and assigning roles to roles to define a role hierarchy.
These activities are all required to bring users
and  permissions  together.  However,  in  many  cases, 
they  are  best  done  by  different  administrators  (or  ad¬ 
ministrative  roles).  Assigning  permissions  to  roles  is 
typically  the  province  of  application  administrators. 
Thus  a  banking  application  can  be  implemented  so 
credit  and  debit  operations  are  assigned  to  a  teller  role, 
whereas  approval  of  a  loan  is  assigned  to  a  manage¬ 
rial  role.  Assignment  of  actual  individuals  to  the  teller 
and  managerial  roles  is  a  personnel  management  func¬ 
tion.  Assigning  roles  to  roles  has  aspects  of  user-role 
assignment  and  role-permission  assignment.  Role-role 
relationships  establish  broad  policy.  Control  of  these 
relationships  would  typically  be  relatively  centralized 
in  the  hands  of  a  few  security  administrators. 

In  this  paper  we  have  focussed  our  attention  exclu¬ 
sively  on  user-role  assignment.  We  recognize  that  a 
comprehensive  administrative  model  for  RBAC  must 
account  for  all  three  issues  mentioned  above,  among 
others.  However,  user-role  assignment  is  a  particularly 
critical  administrative  activity.  We  feel  it  is  the  right 
one  to  focus  on  first. 
In large systems user-role assignment is likely to be
the  first  administrative  function  that  is  decentralized 
and  delegated  to  users  rather  than  system  administra¬ 
tors.  Assigning  people  to  tasks  is  a  normal  managerial 
function.  Assigning  users  to  roles  should  be  a  natural 
part of assigning users to tasks. Empowering managers
to do this routinely is one way of making security
an enabling user-friendly technology rather than
an  intrusive  and  cumbersome  nuisance  as  it  all  too  of¬ 
ten  turns  out  to  be.  A  manager  who  can  assign  a  user 
to  perform  certain  tasks  should  not  have  to  ask  some¬ 
one  else  to  enroll  this  user  in  appropriate  roles.  This 
should  happen  transparently  and  conveniently. 

A  user-role  assignment  model  can  also  be  used  for 
managing  user-group  assignment  and  therefore  has 
applicability  beyond  RBAC.  The  difference  between 
roles  and  groups  was  hotly  debated  at  the  First  ACM 
Workshop  on  RBAC  [San97b].  Workshop  attendees 
arrived  at  the  consensus  that  a  group  is  a  named  col¬ 
lection  of  users  (and  possibly  other  groups).  Groups 
serve  as  a  convenient  shorthand  notation  for  collec¬ 
tions  of  users  and  that  is  the  main  motivation  for  in¬ 
troducing  them.  Roles  are  similar  to  groups  in  that 
they  can  serve  as  a  shorthand  for  collections  of  users, 
but they go beyond groups in also serving as a shorthand
for a collection of permissions. Assigning users
to roles and assigning users to groups are therefore essentially
the same function. Assigning permissions to roles and
permissions  to  groups,  on  the  other  hand,  can  have 
rather  different  characteristics.  We  need  not  get  into 
this  latter  issue  here  since  our  focus  is  on  user-role,  or 
equivalently  user-group,  assignment. 

In  this  paper  we  propose  a  model  for  the  assign¬ 
ment  of  users  to  roles  by  means  of  administrative  roles 
and  permissions.  We  call  our  model  URA97  (user-role 
assignment  ’97).  URA97  imposes  strict  limits  on  in¬ 
dividual  administrators  regarding  which  users  can  be 
assigned  to  which  roles.  We  then  describe  an  imple¬ 
mentation  of  URA97  in  the  Oracle  database  manage¬ 
ment  system  [KL95,  Feu95].  Oracle’s  administrative 
model  for  user-role  assignment  is  very  different  from 
URA97.  Nevertheless,  we  show  how  to  use  Oracle’s 
stored  procedures  to  implement  URA97. 

The  principal  contribution  of  URA97  is  to  pro¬ 
vide  a  concrete  example  of  what  is  meant  by  role- 
based  administration  of  user-role  assignment.  An¬ 
other  central  contribution  of  this  paper  is  to  demon¬ 
strate  that  an  existing  popular  product,  namely  Ora¬ 
cle,  provides  the  necessary  base  mechanisms  and  ex¬ 
tensibility  to  program  the  behavior  of  URA97.  URA97 
is defined in the context of the RBAC96 family
of models due to Sandhu et al. [SCFY96]. However,
it applies to almost any RBAC model, including
[FCK95, Gui95, GI96, HDT95, NO95], because
user-role  assignment  is  a  basic  administrative  feature 
which  will  be  required  in  any  RBAC  model. 

The  rest  of  this  paper  is  organized  as  follows.  We 
begin  by  reviewing  the  RBAC96  family  of  models  in 
section  2.  In  section  3  we  define  the  administrative 
model  called  URA97  for  user-role  assignment  which 
itself  is  role-based.  This  is  followed  by  a  quick  review 
of  relevant  RBAC  features  of  Oracle  in  section  4.  Our 
implementation  of  URA97  in  Oracle  is  described  in 
section  5.  Section  6  concludes  the  paper. 


2  THE  RBAC96  MODELS 

A  general  family  of  RBAC  models  called  RBAC96 
was  defined  by  Sandhu  et  al  [SCFY96].  Figure  1  il¬ 
lustrates  the  most  general  model  in  this  family.  For 
simplicity  we  use  the  term  RBAC96  to  refer  to  the 
family  of  models  as  well  as  its  most  general  member. 

The  top  half  of  figure  1  shows  (regular)  roles  and 
permissions  that  regulate  access  to  data  and  resources. 
The  bottom  half  shows  administrative  roles  and  per¬ 
missions.  Intuitively,  a  user  is  a  human  being  or  an 
autonomous  agent,  a  role  is  a  job  function  or  job  title 
within  the  organization  with  some  associated  seman¬ 
tics  regarding  the  authority  and  responsibility  con¬ 
ferred  on  a  member  of  the  role,  and  a  permission  is 
an  approval  of  a  particular  mode  of  access  to  one  or 
more  objects  in  the  system  or  some  privilege  to  carry 
out  specified  actions.  Roles  are  organized  in  a  partial 
order $\geq$, so that if $x \geq y$ then role x inherits the permissions
of role y. Members of x are also implicitly
members  of  y.  In  such  cases,  we  say  x  is  senior  to  y. 
Each  session  relates  one  user  to  possibly  many  roles. 
The  idea  is  that  a  user  establishes  a  session  and  acti¬ 
vates  some  subset  of  roles  that  he  or  she  is  a  member  of 
(directly  or  indirectly  by  means  of  the  role  hierarchy). 
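
As an informal illustration of how the partial order propagates permissions, the following Python sketch computes the permissions available to a role and to a session. The three roles, their ordering, and the permission names are hypothetical and only echo the banking example of section 1; the function names are likewise assumptions and are not part of RBAC96.

```python
# Hypothetical hierarchy: each role maps to the roles immediately junior to it.
JUNIORS = {
    "manager": {"teller"},
    "teller": {"employee"},
    "employee": set(),
}

# Hypothetical permission-to-role assignment (the PA relation).
PA = {
    "manager": {"approve_loan"},
    "teller": {"credit", "debit"},
    "employee": {"read_bulletin"},
}


def juniors_of(role):
    """Return every role y such that role >= y, including role itself."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(JUNIORS[current])
    return seen


def permissions_of(role):
    """Permissions assigned directly to role plus those inherited from junior roles."""
    perms = set()
    for junior in juniors_of(role):
        perms |= PA[junior]
    return perms


def session_permissions(active_roles):
    """Union of the permissions of the roles activated in one session."""
    perms = set()
    for role in active_roles:
        perms |= permissions_of(role)
    return perms
```

In this sketch a session that activates only the teller role obtains the teller and employee permissions but not the managerial one, which mirrors the statement that a senior role inherits the permissions of its juniors.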

Motivation  and  discussion  about  various  design 
decisions  made  in  developing  this  family  of  models 
is  given  in  [SCFY96,  San97a].  It  is  worth  empha¬ 
sizing  that  RBAC96  distinguishes  roles  and  permis¬ 
sions  from  administrative  roles  and  permissions  re¬ 
spectively,  where  the  latter  are  used  to  manage  the 
former.  How  are  administrative  permissions  and  roles 
managed  in  turn?  One  could  consider  a  second  level 
of  administrative  roles  and  permissions  to  manage  the 
first  level  ones  and  so  on.  We  feel  such  a  progression 
of  administration  is  unnecessary.  Administration  of 
administrative  roles  and  permissions  is  under  control 
[Figure 1 diagram: role hierarchy (RH) and assignment relations of the RBAC96 model]

• $U$, a set of users
  $R$ and $AR$, disjoint sets of (regular) roles and administrative roles
  $P$ and $AP$, disjoint sets of (regular) permissions and administrative permissions
  $S$, a set of sessions

• $UA \subseteq U \times R$, user to role assignment relation
  $AUA \subseteq U \times AR$, user to administrative role assignment relation

• $PA \subseteq P \times R$, permission to role assignment relation
  $APA \subseteq AP \times AR$, permission to administrative role assignment relation

• $RH \subseteq R \times R$, partially ordered role hierarchy
  $ARH \subseteq AR \times AR$, partially ordered administrative role hierarchy
  (both hierarchies are written as $\geq$ in infix notation)

• $user : S \rightarrow U$, maps each session to a single user (which does not change)
  $roles : S \rightarrow 2^{R \cup AR}$ maps each session $s_i$ to a set of roles and administrative roles
  $roles(s_i) \subseteq \{r \mid (\exists r' \geq r)[(user(s_i), r') \in UA \cup AUA]\}$ (which can change with time)
  session $s_i$ has the permissions $\bigcup_{r \in roles(s_i)} \{p \mid (\exists r'' \leq r)[(p, r'') \in PA \cup APA]\}$

•  there  is  a  collection  of  constraints  stipulating  which  values  of  the  various  components  enumerated  above  are 
allowed  or  forbidden. 


Figure  1:  Summary  of  the  RBAC96  Model 
of the chief security officer or delegated in part to
administrative roles.


3  THE  URA97  MODEL 

RBAC  has  many  components  as  described  in  the 
previous  section.  Administration  of  RBAC  involves 
control  over  each  of  these  components  including  cre¬ 
ation  and  deletion  of  roles,  creation  and  deletion  of 
permissions,  assignment  of  permissions  to  roles  and 
their  removal,  creation  and  deletion  of  users,  assign¬ 
ment  of  users  to  roles  and  their  removal,  definition 
and maintenance of the role hierarchy, definition and
maintenance of constraints and all of these in turn for
administrative  roles  and  permissions.  A  comprehen¬ 
sive  administrative  model  would  be  quite  complex  and 
difficult  to  develop  in  a  single  step. 

Fortunately  administration  of  RBAC  can  be  par¬ 
titioned  into  several  areas  for  which  administrative 
models can be separately and independently developed
to be later integrated. In particular we can separate
the issues of assigning users to roles, assigning
permissions  to  roles  and  defining  the  role  hierarchy. 
In  many  cases,  these  activities  would  be  best  done 
by different administrators. Assigning permissions to
roles  is  typically  the  province  of  application  admin¬ 
istrators.  Thus  a  banking  application  can  be  imple¬ 
mented  so  credit  and  debit  operations  are  assigned  to 
a  teller  role,  whereas  approval  of  a  loan  is  assigned  to 
a managerial role. Assignment of actual individuals to
the  teller  and  managerial  roles  is  a  personnel  manage¬ 
ment  function.  Design  of  the  role  hierarchy  relates  to 
design  of  the  organizational  structure  and  is  the  func¬ 
tion  of  a  chief  security  officer  under  guidance  of  a  chief 
information  officer. 

In  this  paper  our  focus  is  exclusively  on  user-role 
assignment. As discussed in section 1 this is likely
to be the first and most widely decentralized administrative
task in RBAC. In the RBAC96 framework of
figure 1 control of UA is vested in the administrative
roles AR. For simplicity we limit our scope to assignment
of users to regular roles. Assignment of users to
administrative  roles  is  centralized  under  the  chief  se¬ 
curity  officer.  In  general  the  chief  security  officer  has 
complete  control  over  all  aspects  of  RBAC96. 

In  the  rest  of  this  section  we  develop  a  model  called 
URA97  in  which  RBAC  is  used  to  manage  user-role  as¬ 
signment.  We  define  URA97  in  two  steps  dealing  with 
granting  a  user  membership  in  a  role  and  revoking  a 
user’s membership. URA97 is deliberately designed
to have a very narrow scope. For example creation
of  users  and  roles  is  outside  its  scope.  In  spite  of  its 
simplicity  URA97  is  quite  powerful  and  goes  much  be¬ 
yond  existing  administrative  models  for  user-role  as¬ 
signment,  such  as  the  one  implemented  in  Oracle.  It 
is  also  applicable  beyond  RBAC  to  user-group  assign¬ 
ment. 

3.1  URA97  Grant  Model 

In  the  simplest  case  user-role  assignment  can  be 
completely  centralized  in  a  single  chief  security  officer 
role.  This  is  readily  implemented  in  existing  systems 
such  as  Oracle.  However,  this  simple  approach  does 
not  scale  to  large  systems.  Clearly  it  is  desirable  to 
decentralize  user-role  assignment  to  some  degree. 

In  several  systems,  including  Oracle,  it  is  possible 
to  designate  a  role,  say,  junior  security  officer  (JSO) 
whose  members  have  administrative  control  over  one 
or more regular roles, say, A, B and C. Thus limited
administrative  authority  is  delegated  to  the  JSO  role. 
Unfortunately  these  systems  typically  allow  the  JSO 
role  to  have  complete  control  over  roles  A,  B  and  C. 

A  member  of  JSO  can  not  only  add  users  to  A,  B  and 
C  but  also  delete  users  from  these  roles  and  add  and 
delete  permissions.  Moreover,  there  is  no  control  on 
which  users  can  be  added  to  the  A,  B  and  C  roles  by 
JSO  members.  Finally,  JSO  members  are  allowed  to 
assign  A,  B  and  C  as  junior  to  any  role  in  the  existing 
hierarchy  (so  long  as  this  does  not  introduce  a  cycle). 
All this is consistent with classical discretionary thinking
whereby members of JSO are effectively designated
as  owners  of  the  A,  B  and  C  roles,  and  therefore  are 
free  to  do  whatever  they  want  to  these  roles. 

In URA97 our goal is to impose restrictions on
which  users  can  be  added  to  a  role  by  whom,  as  well 
as  to  clearly  separate  the  ability  to  add  and  remove 
users  from  other  operations  on  the  role.  The  notion 
of  a  prerequisite  condition  is  a  key  part  of  URA97. 

Definition 1 A prerequisite condition is a
boolean expression using the usual $\wedge$ and $\vee$ operators
on terms of the form $x$ and $\overline{x}$ where $x$ is a
regular role (i.e., $x \in R$). A prerequisite condition
is evaluated for a user $u$ by interpreting $x$ to be
true if $(\exists x' \geq x)\,(u, x') \in UA$ and $\overline{x}$ to be true if
$(\forall x' \geq x)\,(u, x') \notin UA$. For a given set of roles $R$ let
$CR$ denote all possible prerequisite conditions that
can be formed using the roles in $R$. □
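
The evaluation rule of Definition 1 can be sketched in Python as follows. The nested-tuple encoding of conditions, the function names, the users, and the small hierarchy fragment are all assumptions made for illustration; only the membership test itself (explicit or implicit membership through a senior role) follows the definition.

```python
# Assumed fragment of a role hierarchy: role -> immediately junior roles.
JUNIORS = {
    "PL1": {"PE1", "QE1"},
    "PE1": {"E1"},
    "QE1": {"E1"},
    "E1": {"ED"},
    "ED": {"E"},
    "E": set(),
}


def juniors_of(role):
    """Every role y with role >= y, including role itself."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(JUNIORS[current])
    return seen


def is_member(user, role, ua):
    """True if (user, x') is in UA for some x' >= role."""
    return any(u == user and role in juniors_of(held) for (u, held) in ua)


def holds(condition, user, ua):
    """Evaluate a prerequisite condition given as a nested tuple.

    ("role", x)      membership term x
    ("not", x)       negated term: no x' >= x has (user, x') in UA
    ("and", c1, c2)  conjunction
    ("or", c1, c2)   disjunction
    """
    kind = condition[0]
    if kind == "role":
        return is_member(user, condition[1], ua)
    if kind == "not":
        return not is_member(user, condition[1], ua)
    if kind == "and":
        return holds(condition[1], user, ua) and holds(condition[2], user, ua)
    if kind == "or":
        return holds(condition[1], user, ua) or holds(condition[2], user, ua)
    raise ValueError(f"unknown condition: {condition!r}")


# Hypothetical user-role assignment.
UA = {("bob", "ED"), ("carol", "PE1")}
```

Here `holds(("role", "ED"), "carol", UA)` is true even though Carol is only explicitly assigned to PE1, because PE1 is senior to ED in the assumed hierarchy; the negated term on E1 is false for her for the same reason.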

In the trivial case a prerequisite condition can be a
tautology which is always true. The simplest non-trivial
case of a prerequisite condition is a test for membership
in a single role, in which situation that single
role is called a prerequisite role.

Figure 3: An Example Administrative Role Hierarchy

User-role  assignment  is  authorized  in  URA97  by  the 
following  relation. 

Definition 2 The URA97 model controls user-role
assignment by means of the relation
$\textit{can-assign} \subseteq AR \times CR \times 2^{R}$. □

The meaning of can-assign(x, y, {a, b, c}) is that a
member of the administrative role x (or a member of
an administrative role that is senior to x) can assign a
user whose current membership, or non-membership,
in regular roles satisfies the prerequisite condition y to
be a member of regular roles a, b or c.¹

To appreciate the motivation behind the can-assign
relation consider the role hierarchy of figure 2 and
the administrative role hierarchy of figure 3. Figure 2
shows the regular roles that exist in an engineering department.
There is a junior-most role E to which all
employees in the organization belong. Within the engineering
department there is a junior-most role ED
and senior-most role DIR. In between there are roles
for two projects within the department, project 1 on
the left and project 2 on the right. Each project has
a senior-most project lead role (PL1 and PL2) and a
junior-most engineer role (E1 and E2). In between
each project has two incomparable roles, production
engineer (PE1 and PE2) and quality engineer (QE1
and QE2).

¹ User-role assignment is subject to constraints, such as mutually
exclusive roles or maximum cardinality, that may be imposed.
The assignment will succeed if and only if it is authorized
by can-assign and it satisfies all relevant constraints.

Figure  2  suffices  for  our  purpose  but  this  structure 
can,  of  course,  be  extended  to  dozens  and  even  hun¬ 
dreds  of  projects  within  the  engineering  department. 
Moreover, each project could have a different structure
for its roles. The example can also be extended
to multiple departments with different structure and
policies  applied  to  each  department. 

Figure  3  shows  the  administrative  role  hierarchy 
which  co-exists  with  figure  2.  The  senior-most  role 
is  the  senior  security  officer  (SSO).  Our  main  interest 
is  in  the  administrative  roles  junior  to  SSO.  These 
consist of two project security officer roles (PSO1 and
PSO2) and a department security officer (DSO) role
with  the  relationships  illustrated  in  the  figure. 


3.1.1  Prerequisite  Roles 

For  sake  of  illustration  we  define  the  can-assign  rela¬ 
tion  shown  in  table  1(a).  This  example  has  the  sim¬ 
plest  prerequisite  condition  of  testing  membership  in 
a  single  role  known  as  the  prerequisite  role. 

The PSO1 role has partial responsibility over
project 1 roles. Let Alice be a member of the PSO1
role and Bob a member of the ED role. Alice can assign
Bob to any of the E1, PE1 and QE1 roles, but not
to the PL1 role. Also if Charlie is not a member of the
ED role, then Alice cannot assign him to any project
1 role. Hence, Alice has authority to enroll users in
the E1, PE1 and QE1 roles provided these users are
already members of ED. Note that if Alice assigns Bob
to PE1 he does not need to be explicitly assigned to
E1, since E1 permissions will be inherited via the role
hierarchy. The PSO2 role is similar to PSO1 but with
respect to project 2. The DSO role inherits the authority
of PSO1 and PSO2 roles but can further add
users who are members of ED to the PL1 and PL2
roles. The SSO role can add users who are in the E
role to the ED role, as well as add users who are in
the ED role to the DIR role. This ensures that even
the SSO must first enroll a user in the ED role before
that user is enrolled in a role senior to ED. This is a
reasonable specification for can-assign. There are, of
course, lots of other equally reasonable specifications
in this context. This is a matter of policy decision and
our model provides the necessary flexibility.

Admin. Role   Prereq. Role   Role Set
PSO1          ED             {E1, PE1, QE1}
PSO2          ED             {E2, PE2, QE2}
DSO           ED             {PL1, PL2}
SSO           E              {ED}
SSO           ED             {DIR}

(a) Subset Notation

Admin. Role   Prereq. Role   Role Range
PSO1          ED             [E1, PL1)
PSO2          ED             [E2, PL2)
DSO           ED             (ED, DIR)
SSO           E              [ED, ED]
SSO           ED             (ED, DIR]

(b) Range Notation

Table 1: can-assign with Prerequisite Roles
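
The Alice, Bob and Charlie scenario can be replayed with a small Python sketch of the can-assign check under table 1(a). The regular hierarchy is transcribed from the textual description of figure 2; placing SSO senior to DSO in the administrative hierarchy is an assumption about figure 3. The user dana, the function names, and the data layout are hypothetical, and the sketch ignores the additional constraints mentioned in the footnote.

```python
# Regular role hierarchy as described for figure 2: role -> immediately junior roles.
ROLE_JUNIORS = {
    "DIR": {"PL1", "PL2"},
    "PL1": {"PE1", "QE1"}, "PE1": {"E1"}, "QE1": {"E1"}, "E1": {"ED"},
    "PL2": {"PE2", "QE2"}, "PE2": {"E2"}, "QE2": {"E2"}, "E2": {"ED"},
    "ED": {"E"}, "E": set(),
}
# Administrative hierarchy; SSO senior to DSO is an assumption about figure 3.
ADMIN_JUNIORS = {"SSO": {"DSO"}, "DSO": {"PSO1", "PSO2"}, "PSO1": set(), "PSO2": set()}

# can-assign tuples of table 1(a): (administrative role, prerequisite role, role set).
CAN_ASSIGN = [
    ("PSO1", "ED", {"E1", "PE1", "QE1"}),
    ("PSO2", "ED", {"E2", "PE2", "QE2"}),
    ("DSO", "ED", {"PL1", "PL2"}),
    ("SSO", "E", {"ED"}),
    ("SSO", "ED", {"DIR"}),
]

# Hypothetical assignments: Bob is in ED, Charlie only in E.
UA = {("bob", "ED"), ("charlie", "E")}
AUA = {("alice", "PSO1"), ("dana", "DSO")}


def juniors_of(role, juniors):
    """Every role y with role >= y in the given hierarchy, including role itself."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(juniors[current])
    return seen


def is_member(user, role, assignment, juniors):
    """True if the user holds the role explicitly or through a senior role."""
    return any(u == user and role in juniors_of(held, juniors)
               for (u, held) in assignment)


def may_assign(actor, user, target):
    """True if some can-assign tuple authorizes actor to put user into target."""
    return any(
        target in role_set
        and is_member(actor, admin_role, AUA, ADMIN_JUNIORS)
        and is_member(user, prereq, UA, ROLE_JUNIORS)
        for admin_role, prereq, role_set in CAN_ASSIGN
    )
```

Under these assumptions Alice may place Bob in PE1 but not in PL1, and may not place Charlie in any project 1 role, while a DSO member such as dana also obtains the PSO1 and PSO2 authorizations through the administrative hierarchy.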

In  general,  one  would  expect  that  the  role  being 
assigned  is  senior  to  the  role  previously  required  of 
the  user.  That  is,  if  we  have  can-assign(a,  b,  C)  then 
b is junior to all roles $c \in C$. We believe this will
usually  be  the  case,  but  we  do  not  require  it  in  the 
model.  This  allows  URA97  to  be  applicable  to  situa¬ 
tions  where  there  is  no  role  hierarchy  or  where  such  a 
constraint  may  not  be  appropriate. 

The  notation  of  table  1(a)  has  benefited  from  the 
administrative  role  hierarchy.  Thus  for  the  DSO  we 
have specified the role set as {PL1, PL2} and the other
values are inherited from PSO1 and PSO2. Similarly
for the SSO. Nevertheless explicit enumeration of the
role set is unwieldy, particularly if we were to scale
up to dozens or hundreds of projects in the department.
Moreover, explicit enumeration is not resilient
with respect to changes in the role hierarchy. Suppose
a third project is introduced in the department, with
roles E3, PE3, QE3, PL3 and PSO3 analogous to corresponding
roles for projects 1 and 2. We can add the
following row to table 1(a).


Admin. Role   Prereq. Role   Role Set
PSO3          ED             {E3, PE3, QE3}

This  is  a  reasonable  change  to  require  when  the  new 
project  and  its  roles  are  introduced  into  the  regular 
and  administrative  role  hierarchies.  However,  we  also 
need to modify the row for DSO in table 1(a) to include
PL3.

3.1.2  Range  Notation 

Consider  instead  the  range  notation  illustrated  in  ta¬ 
ble  1(b).  Table  1(b)  shows  the  same  role  sets  as  ta¬ 
ble  1(a)  but  defines  these  sets  by  identifying  a  range 
within the role hierarchy of figure 2 by means of
the  familiar  closed  and  open  interval  notation. 

Definition 3 Role sets are specified in the URA97
model by the notation below

$[x, y] = \{r \in R \mid x \geq r \wedge r \geq y\}$

$(x, y] = \{r \in R \mid x > r \wedge r \geq y\}$

$[x, y) = \{r \in R \mid x \geq r \wedge r > y\}$

$(x, y) = \{r \in R \mid x > r \wedge r > y\}$

□

This  notation  is  resilient  to  modifications  in  the  role 
hierarchy  such  as  addition  of  a  third  project  which 
requires  addition  of  the  following  row  to  table  1(b). 


Admin. Role   Prereq. Role   Role Range
PSO3          ED             [E3, PL3)

No other change is required since the (ED, DIR) range
specified for the DSO will automatically pick up PL3.
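
The resilience of the range notation can be checked with a short Python sketch that computes a range from the hierarchy instead of enumerating it. The sketch takes the end points in the order used in table 1(b), junior end point first; that reading, the function names, and the exact shape of the project 3 roles are assumptions made for illustration.

```python
# Regular role hierarchy as described for figure 2: role -> immediately junior roles.
ROLE_JUNIORS = {
    "DIR": {"PL1", "PL2"},
    "PL1": {"PE1", "QE1"}, "PE1": {"E1"}, "QE1": {"E1"}, "E1": {"ED"},
    "PL2": {"PE2", "QE2"}, "PE2": {"E2"}, "QE2": {"E2"}, "E2": {"ED"},
    "ED": {"E"}, "E": set(),
}


def juniors_of(role, juniors):
    """Every role y with role >= y, including role itself."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(juniors[current])
    return seen


def role_range(low, high, juniors, low_closed, high_closed):
    """Roles r with low <= r <= high, dropping an end point when its side is open."""
    result = set()
    for r in juniors_of(high, juniors):            # high >= r
        if low not in juniors_of(r, juniors):      # r >= low
            continue
        if (r == low and not low_closed) or (r == high and not high_closed):
            continue
        result.add(r)
    return result


# Hypothetical third project, shaped like projects 1 and 2.
with_p3 = {role: set(js) for role, js in ROLE_JUNIORS.items()}
with_p3["DIR"].add("PL3")
with_p3.update({"PL3": {"PE3", "QE3"}, "PE3": {"E3"}, "QE3": {"E3"}, "E3": {"ED"}})

pso1_range = role_range("E1", "PL1", ROLE_JUNIORS, True, False)    # [E1, PL1)
dso_before = role_range("ED", "DIR", ROLE_JUNIORS, False, False)   # (ED, DIR)
dso_after = role_range("ED", "DIR", with_p3, False, False)         # (ED, DIR) with project 3
```

With these definitions `pso1_range` contains E1, PE1 and QE1, and `dso_after` differs from `dso_before` exactly by the four project 3 roles, without any change to the DSO row itself.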

The  range  notation  is,  of  course,  not  resilient  to  all 
changes  in  the  role  hierarchy.  Deletion  of  one  of  the 
end  points  of  a  range  can  leave  a  dangling  reference 
and an invalid range. Standard techniques for ensuring
referential integrity would need to be applied when
modifying  the  range  hierarchy.  Changes  to  role-role 
relationships  could  also  cause  a  range  to  be  drasti¬ 
cally  different  from  its  original  meaning.  Nevertheless 
the  range  notation  is  much  more  convenient  than  ex¬ 
plicit  enumeration.  There  is  also  no  loss  of  generality 
in  adopting  the  range  notation  since  every  set  of  roles 
can  be  expressed  as  a  union  of  disjoint  ranges. 

Admin. Role   Prereq. Condition            Role Range
PSO1          ED                           [E1, E1]
PSO1          $ED \wedge \overline{QE1}$   [PE1, PE1]
PSO1          $ED \wedge \overline{PE1}$   [QE1, QE1]
PSO1          $PE1 \wedge QE1$             [PL1, PL1]
PSO2          ED                           [E2, E2]
PSO2          $ED \wedge \overline{QE2}$   [PE2, PE2]
PSO2          $ED \wedge \overline{PE2}$   [QE2, QE2]
PSO2          $PE2 \wedge QE2$             [PL2, PL2]
DSO           ED                           (ED, DIR)
SSO           E                            [ED, ED]
SSO           ED                           (ED, DIR]

Table 2: can-assign with Prerequisite Conditions


Strictly speaking the two specifications of table 1(a)
and 1(b) are not precisely identical. In table 1(a) the
DSO role is explicitly authorized to enroll users in PL1
and PL2, and inherits the ability to enroll users in
other project 1 and 2 roles from PSO1 and PSO2. On
the other hand, in table 1(b) the DSO role is explicitly
authorized to enroll users in all project 1 and 2 roles.
As it stands the net effect is the same. However, if
modifications are made to the role hierarchy or to the
PSO1 or PSO2 authorizations the effect can be different.
The DSO authorization in table 1(a) can be
replaced by the following row to make table 1(a) more
nearly identical to table 1(b).


Admin. Role   Prereq. Role   Role Set
DSO           ED             {E1, PE1, QE1, PL1, E2, PE2, QE2, PL2}

Now even if the PSO1 and PSO2 roles of table 1(a) are
modified respectively to the role sets {E1} and {E2},
the DSO role will still retain administrative authority
over all project 1 and project 2 roles. Of course,
explicit and implicit specifications will never behave
exactly identically under all circumstances. For instance,
introduction of a new project 3 will exhibit
differences as discussed above. Conversely, the DSO
authorization in table 1(b) can be replaced by the following
rows to make table 1(b) more nearly identical
to table 1(a).


Admin. Role   Prereq. Role   Role Range
DSO           ED             [PL1, PL1]
DSO           ED             [PL2, PL2]

There is an analogous situation with the SSO role
in tables 1(a) and 1(b). Clearly, we must anticipate
the impact of future changes when we specify
the can-assign relation.


3.1.3  Prerequisite  Conditions 

An example of can-assign which uses prerequisite conditions
rather than prerequisite roles is shown in table 2.
The authorizations for PSO1 and PSO2 have
been changed relative to table 1.

Let us consider the PSO1 tuples (the analysis for PSO2
is exactly similar). The first tuple authorizes PSO1 to
assign users with prerequisite role ED into E1. The
second one authorizes PSO1 to assign users with
prerequisite condition ED ∧ ¬QE1 to PE1. Similarly, the
third tuple authorizes PSO1 to assign users with
prerequisite condition ED ∧ ¬PE1 to QE1. Taken together,
the second and third tuples authorize PSO1 to put a
user who is a member of ED into one but not both of
PE1 and QE1. This illustrates how mutually exclusive
roles can be enforced by URA97. PE1 and QE1 are
mutually exclusive with respect to the power of PSO1.
However, for the DSO and SSO these are not mutually
exclusive. Hence, the notion of mutual exclusion is a
relative one in URA97. The fourth tuple authorizes
PSO1 to put a user who is a member of both PE1 and
QE1 into PL1. Of course, a user could have become
a member of both PE1 and QE1 only by actions of a
more powerful administrator than PSO1.
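
The effect of the four PSO1 tuples can be illustrated with a minimal Python sketch. The encoding of a prerequisite condition as a pair of required and forbidden role sets, and the names CAN_ASSIGN_PSO1 and may_assign, are assumptions made for illustration only. The caller passes the roles the user currently belongs to, so the role hierarchy and the range notation are not modeled here.

```python
# Each tuple: (administrative role, required roles, forbidden roles, target role).
# Encoding a condition as two sets is an illustrative assumption.
CAN_ASSIGN_PSO1 = [
    ("PSO1", {"ED"}, set(), "E1"),            # ED            -> E1
    ("PSO1", {"ED"}, {"QE1"}, "PE1"),         # ED and not QE1 -> PE1
    ("PSO1", {"ED"}, {"PE1"}, "QE1"),         # ED and not PE1 -> QE1
    ("PSO1", {"PE1", "QE1"}, set(), "PL1"),   # PE1 and QE1    -> PL1
]


def may_assign(admin_role, member_of, target):
    """Return True if some tuple lets admin_role enroll a user whose
    current memberships are member_of into the role target."""
    for admin, required, forbidden, role in CAN_ASSIGN_PSO1:
        if (admin == admin_role and role == target
                and required <= member_of
                and not (forbidden & member_of)):
            return True
    return False


# A member of ED can be placed in PE1, but afterwards no longer in QE1.
user_roles = {"ED"}
print(may_assign("PSO1", user_roles, "PE1"))
user_roles.add("PE1")
print(may_assign("PSO1", user_roles, "QE1"))
```

Under this encoding the mutual exclusion of PE1 and QE1 is visible only to PSO1; a more senior administrator with an unconditioned tuple would not be bound by it.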

3.2  URA97  Revoke  Model 

We now turn to consideration of the URA97 revoke
model. The objective is to define a revoke model
that is consistent with the philosophy of RBAC. This
causes us to depart from classical discretionary
approaches to revocation.

In the classical discretionary approach to revocation
there are at least two issues that introduce complexity
and subtlety [GW76, Fag78]. Suppose Alice grants Bob
some permission P. This is done at Alice’s discretion
because Alice is either the owner of the object to which
P pertains or has been granted administrative authority
on P by the actual owner. Alice can later revoke P from
Bob. Now suppose Bob has received permission P from Alice
and from Charlie. If Alice revokes her grant of P to Bob
he should still continue to retain P because of Charlie’s
grant. A related issue is that of cascading revokes.
Suppose Charlie’s grant was in turn obtained from
Alice. Perhaps Bob’s permission should end up being
revoked by Alice’s action. Or perhaps it should not,
because Alice only revoked her direct grant to Bob but
not the indirect one via Charlie, which really occurred
at Charlie’s discretion. A considerable literature has
developed examining the subtleties that arise, especially
when hierarchical groups and negative permissions
or denials are brought into play (see, for example,
[Lun88, BSJ93, FWF95, GSF91, RBKW91]).

The RBAC approach to authorization is quite different
from the traditional discretionary one. In
RBAC users are made members of roles because of
their job function or task assignment in the interest
of the organization. Granting of membership in
a role is specifically not done at the grantor’s whim.
Suppose Alice makes Bob a member of a role X. In
URA97 this happens because Alice is assigned suitable
administrative authority over X via some administrative
role Y and Bob is eligible for membership
in X due to Bob’s existing role memberships (and
non-memberships) satisfying the prerequisite condition.
Moreover, there are some organizational circumstances
which cause Alice to grant Bob this membership.
It is not merely being done at Alice’s personal
fancy. Now if at some later time Alice is removed from
the administrative role Y there is clearly no reason to
also remove Bob from X. A change in Alice’s job function
should not necessarily undo her previous grants.
Presumably some other administrator, say Dorothy,
will take over Alice’s responsibility. Similarly, suppose
Alice and Charlie both grant membership to Bob in
X. At some later time Bob is reassigned to some other
project and no longer needs to be a member of role X.
It is not material whether Alice or Charlie or both or
Dorothy revokes Bob’s membership. Bob’s membership
in X is being revoked due to a change in organizational
circumstances.

To summarize, in classical discretionary access control
the source (direct or indirect) of a permission and
the identity of the revoker are typically taken into
account in interpreting the revoke operation.² These
issues do not arise in the same way for revocation of
user-role assignment in RBAC. However, there are related
subtleties that arise in RBAC concerning the interaction
between granting and revocation of user-role
membership and the role hierarchy. We will illustrate
these in a moment.

² This is true more in theory than practice, because many
commercial products opt for a simpler semantics than implied
by a strict owner-based discretionary viewpoint.

3.2.1  The  Can-Revoke  Relation

We now introduce our notation for authorizing revocation.

Definition 4 The URA97 model controls user-role
revocation by means of the relation can-revoke ⊆ AR × 2^R. □

The meaning of can-revoke(x, Y) is that a member of
the administrative role x (or a member of an administrative
role that is senior to x) can revoke membership
of a user from any regular role y ∈ Y. Y is specified
using the range notation of definition 3. We say Y
defines the range of revocation.

3.2.2  Weak  Revocation 

The revocation operation in URA97 is said to be weak
because it applies only to the role that is directly
revoked. Suppose Bob is a member of PE1 and E1. If
Alice revokes Bob’s membership from E1, he continues
to be a member of the senior role PE1 and therefore
can use the permissions of E1.

To  make  the  notion  of  weak  revocation  precise  we 
introduce  the  following  terminology.  Recall  that  UA 
is  the  user  assignment  relation. 

Definition 5 Let us say a user U is an explicit member
of role x if (U, x) ∈ UA, and that U is an implicit
member of role x if for some x' > x, (U, x') ∈ UA. □

Note that a user can simultaneously be an explicit and
implicit member of a role.

Weak  revocation  has  an  impact  only  on  explicit 
membership.  It  has  the  straightforward  meaning 
stated  below. 

Definition 6 [Weak Revocation Algorithm]

1. Let Alice have a session with administrative roles
A = {a_1, a_2, ..., a_k}, and let Alice try to weakly
revoke Bob from role x.

2. If Bob is not an explicit member of x this operation
has no effect, otherwise there are two cases.

(a) There exists a can-revoke tuple (b, Y) such
that there exists a_i ∈ A, a_i ≥ b and x ∈ Y.

In this case Bob’s explicit membership in x
is revoked.

(b) There does not exist a can-revoke tuple as
identified above.

In this case the weak revoke operation has
no effect.

□ 
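
Definition 6 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the administrative hierarchy SSO > DSO > PSO1, PSO2 is assumed from the role names, only the two project-level can-revoke rows are included, and each range is written out as an explicit set instead of being computed from the role hierarchy. The names ADMIN_GE, CAN_REVOKE and weak_revoke are hypothetical.

```python
# For each administrative role: the roles it is senior to or equal to.
# Hierarchy assumed for illustration: SSO > DSO > PSO1, PSO2.
ADMIN_GE = {
    "PSO1": {"PSO1"},
    "PSO2": {"PSO2"},
    "DSO": {"DSO", "PSO1", "PSO2"},
    "SSO": {"SSO", "DSO", "PSO1", "PSO2"},
}

# can-revoke tuples (b, Y) with the range Y already expanded to a set.
CAN_REVOKE = [
    ("PSO1", {"E1", "PE1", "QE1"}),   # [E1, PL1)
    ("PSO2", {"E2", "PE2", "QE2"}),   # [E2, PL2)
]


def weak_revoke(session_roles, ua, user, x):
    """Weakly revoke user from role x on behalf of a session whose
    administrative roles are session_roles.

    ua is a set of (user, role) pairs holding explicit memberships and is
    modified in place. Returns True if a membership was removed."""
    if (user, x) not in ua:
        return False                      # step 2: not an explicit member
    for b, role_set in CAN_REVOKE:
        # case (a): some session role a_i >= b and x lies in the range Y
        if x in role_set and any(b in ADMIN_GE[a] for a in session_roles):
            ua.discard((user, x))
            return True
    return False                          # case (b): no authorizing tuple


# Bob is an explicit member of both E1 and PE1, as in the example above.
ua = {("Bob", "E1"), ("Bob", "PE1")}
weak_revoke({"PSO1"}, ua, "Bob", "E1")
print(ua)
```

After the call Bob's explicit membership in PE1 is untouched, so he remains an implicit member of E1, which is exactly why the operation is called weak.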
Admin. Role    Role Range
PSO1           [E1, PL1)
PSO2           [E2, PL2)
DSO            (ED, DIR)
SSO            [ED, DIR]

Table 3: Example of can-revoke


3.2.3  Strong  Revocation 

Strong revocation in URA97 requires revocation of
both explicit and implicit membership. Strong
revocation of U’s membership in x requires that U be
removed not only from explicit membership in x, but
also from explicit (or implicit) membership in all roles
senior to x. Strong revocation therefore has a cascading
effect upwards in the role hierarchy. However,
strong revocation in URA97 takes effect only if all
implied revocations upward in the role hierarchy are
within the revocation range of the administrative roles
that are active in a session.

In other words, strong revocation is equivalent to a
series of weak revocations. Although it is theoretically
redundant, strong revocation is a useful and convenient
operation for administrators. It is much better
for the system to figure out what weak revocations
need to be carried out to achieve strong revocation,
rather than leave it to administrators to determine
this.

Let us consider the example of can-revoke shown in
table 3 and interpret it in the context of the hierarchies
of figures 2 and 3. Let Alice be a member of PSO1, and
let this be the only administrative role she has. Alice
is authorized to strongly revoke membership of users
from roles E1, PE1 and QE1. Table 4(a) illustrates
whether or not Alice can strongly revoke membership
of a user from role E1. The effect of Alice’s strong
revocation of each of these users from E1 is shown in
table 4(b). Alice is not allowed to strongly revoke Dave
and Eve from E1 because they are members of senior
roles outside the scope of Alice’s revoking authority. If
Alice were assigned to the DSO role she could strongly
revoke Dave from E1 but still would not be able to
strongly revoke Eve’s membership in E1. In order to
strongly revoke Eve from E1, Alice needs to be in the
SSO role.

The  algorithm  for  strong  revocation  is  stated  in 
terms  of  weak  revocation  as  follows. 


Definition 7 [Strong Revocation Algorithm]

1. Let Alice have a session with administrative roles
A = {a_1, a_2, ..., a_k}, and let Alice try to strongly
revoke Bob from role x.

2. Find all roles y ≥ x such that Bob is a member of y.

3. Weakly revoke Bob from all such y as if Alice did
this weak revoke.

4. If any of the weak revokes fails then Alice’s strong
revoke has no effect; otherwise all weak revokes
succeed.

An alternate approach would be to do only those weak
revokes that succeed and ignore the rest. We decided
to go with a cleaner all-or-nothing semantics in
URA97.
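
The all-or-nothing behavior of definition 7 can be sketched as follows. The sketch makes several assumptions: only the project 1 branch of the role hierarchy is modeled (E1 below PE1 and QE1, these below PL1, PL1 below DIR), the revocation authority of the session is passed in as an already expanded set of roles, and the memberships of table 4(a) are treated as explicit memberships. The names SENIOR_OR_EQUAL and strong_revoke are hypothetical.

```python
# For each role: the set of roles senior to or equal to it.
# Hierarchy assumed for illustration: E1 < PE1, QE1 < PL1 < DIR.
SENIOR_OR_EQUAL = {
    "E1": {"E1", "PE1", "QE1", "PL1", "DIR"},
    "PE1": {"PE1", "PL1", "DIR"},
    "QE1": {"QE1", "PL1", "DIR"},
    "PL1": {"PL1", "DIR"},
    "DIR": {"DIR"},
}

# Memberships of table 4(a), treated here as explicit memberships.
UA = {
    ("Bob", "E1"), ("Bob", "PE1"),
    ("Cathy", "E1"), ("Cathy", "PE1"), ("Cathy", "QE1"),
    ("Dave", "E1"), ("Dave", "PE1"), ("Dave", "QE1"), ("Dave", "PL1"),
    ("Eve", "E1"), ("Eve", "PE1"), ("Eve", "QE1"), ("Eve", "PL1"),
    ("Eve", "DIR"),
}


def strong_revoke(revocable, ua, user, x):
    """Strongly revoke user from role x.

    revocable is the set of roles the session may weakly revoke from;
    ua is a set of (user, role) pairs and is modified in place.
    Returns False, leaving ua unchanged, if any implied weak revoke
    falls outside revocable."""
    targets = {y for y in SENIOR_OR_EQUAL[x] if (user, y) in ua}
    if not targets <= revocable:
        return False                       # all-or-nothing: no effect
    ua.difference_update({(user, y) for y in targets})
    return True


PSO1_RANGE = {"E1", "PE1", "QE1"}          # [E1, PL1) from table 3
ua = set(UA)
print(strong_revoke(PSO1_RANGE, ua, "Bob", "E1"))
print(strong_revoke(PSO1_RANGE, ua, "Dave", "E1"))
```

The alternate semantics mentioned above would replace the subset test by removing only the pairs whose role lies in revocable.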

So far we have looked at the cascading of revocation
upward in the role hierarchy. There is a downward
cascading effect that also occurs. Consider Bob in our
example who is a member of E1 and PE1. Suppose
further that Bob is an explicit member of PE1 and
thereby an implicit member of E1. What happens if
Alice revokes Bob from PE1? If we remove (Bob, PE1)
from the UA relation, Bob’s implicit membership in
E1 will also be removed. On the other hand, if Bob is
an explicit member of PE1 and also an explicit member
of E1 then Alice’s revocation of Bob from PE1
does not remove him from E1. The revoke operations
we have defined in URA97 have the following effect.

Property 1. Implicit membership in a role
a is dependent on explicit membership in
some senior role b > a. Therefore when explicit
membership of a user is revoked from
b, implicit membership is also automatically
revoked on junior role a unless there is some
other senior role c > a in which the user continues
to be an explicit member. (This will
require b ≠ c.)

Note  that  our  examples  of  can-assign  in  table  1(b) 
and  can-revoke  in  table  3  are  complementary  in  that 
each  administrative  role  has  the  same  range  for  adding 
users  and  removing  users  from  roles.  Although  this 
would  be  a  common  case  we  do  not  impose  it  as  a 
requirement  on  our  model. 

User    E1    PE1   QE1   PL1   DIR   Alice can revoke user from E1
Bob     Yes   Yes   No    No    No    Yes
Cathy   Yes   Yes   Yes   No    No    Yes
Dave    Yes   Yes   Yes   Yes   No    No
Eve     Yes   Yes   Yes   Yes   Yes   No

(a) Prior to strong revocation


User    E1    PE1   QE1   PL1   DIR   Alice revokes user from E1
Bob     No    No    No    No    No    removed from E1, PE1
Cathy   No    No    No    No    No    removed from E1, PE1, QE1
Dave    Yes   Yes   Yes   Yes   No    no effect
Eve     Yes   Yes   Yes   Yes   Yes   no effect

(b) After strong revocation


Table 4: Example of Strong Revocation


3.3  Summary  of  URA97

URA97 controls user-role assignment by means of
the relation can-assign ⊆ AR × CR × 2^R. Role
sets are specified using the range notation of definition 3.
Assignment has a simple behavior whereby
can-assign(a, b, C) authorizes a session with an
administrative role ≥ a to enroll any user who satisfies
the prerequisite condition b into any role c ∈ C.
The prerequisite condition is a boolean expression using
the usual ∧ and ∨ operators on terms of the form
x and ¬x, respectively denoting membership and
non-membership in regular role x.

Revocation is controlled in URA97 by the relation
can-revoke ⊆ AR × 2^R. Weak revocation applies only
to explicit membership in a single role as per the
algorithm of definition 6. Strong revocation cascades
upwards in the role hierarchy as per the algorithm of
definition 7. In both cases revocation cascades downwards
as noted in property 1.


4  ORACLE  RBAC  FEATURES 

The Oracle database management system [KL95,
Feu95] provides support for RBAC including support
for hierarchical roles. However, Oracle does not directly
support the URA97 model. In particular, Oracle
has a strong discretionary flavor to its administrative
model for user-role assignment and revocation.
Also the Oracle revocation model is similar to
our weak revoke and does not cascade revocation upwards
in the role hierarchy like our strong revoke does.
This is reasonable given Oracle’s discretionary orientation.
Nevertheless, we will see in the next section
how it is possible to use Oracle’s stored procedures to
implement URA97. In this section we briefly review
relevant features of Oracle access control.


4.1  Privileges 


Oracle has two kinds of privileges, system privileges
and object privileges. System privileges authorize actions
on a particular type of object, for example create
table, create user, etc. There are over 60 distinct system
privileges. Object privileges authorize actions on
a specific object (table, view, procedure, package etc.).
Typical examples of object privileges are select rows
from a table, delete rows, execute procedures etc.

Who can grant or revoke privileges from users or
roles? The answer depends on various issues such
as whether it is a system or an object privilege, and
whether the object is owned by the user, etc. In order
to grant or revoke a system privilege the user should
have the admin option on that privilege or the user
should have the GRANT_ANY_PRIVILEGE system privilege.
In order to grant or revoke an object privilege
a user should own that particular object or the user
should have the grant option on the object if it is owned
by someone else.

4.2  Roles  in  Oracle

Oracle provides roles (from Oracle 7.0 onwards)
for ease of management of privilege assignment. System
and object privileges can be granted to a role.
A role can be granted to any other role (circular
granting is not allowed). Any role can be granted
to any user in the database. A role can either
be enabled or disabled during a session. This includes
both explicit and implicit roles that a user
is a member of. Enabling a role will implicitly enable
all the roles granted to it directly or transitively.
The system privileges related to role management
are CREATE_ROLE, GRANT_ANY_ROLE,
DROP_ROLE, and DROP_ANY_ROLE.

Information about privileges assigned to a role can
be obtained from Oracle’s built-in views
ROLE_SYS_PRIVILEGES, ROLE_TAB_PRIVILEGES,
and ROLE_ROLE_PRIVS. When a regular user performs
a query on these views, they show only
information pertaining to the roles granted to that
user. However, the Oracle internal user SYS will
see information about all the roles through these
views. The view SESSION_ROLES provides information
about roles that are enabled in a session. The
view ROLE_ROLE_PRIVS shows information about
which roles are directly assigned to another role. Roles
inherited transitively are not shown. For example, if
role C was granted to role B and role B to role A, the
ROLE_ROLE_PRIVS view will show that B has been
granted to A and C to B, but will not show the implied
transitive C to A grant.
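
Because only direct grants are listed, implied grants have to be derived by closing the direct ones under transitivity. The following Python sketch shows that derivation on the A, B, C example. It does not query Oracle; the representation of each grant as a (granted role, grantee role) pair and the function name transitive_grants are assumptions made for illustration.

```python
def transitive_grants(direct):
    """Return the transitive closure of a set of direct role grants.

    direct is a set of (granted_role, grantee_role) pairs, one per direct
    role-to-role grant. The result contains every direct pair plus every
    pair implied by chaining grants."""
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        for granted, middle in list(closure):
            for middle2, grantee in list(closure):
                if middle == middle2 and (granted, grantee) not in closure:
                    closure.add((granted, grantee))
                    changed = True
    return closure


# C was granted to B, and B was granted to A.
direct = {("C", "B"), ("B", "A")}
implied = transitive_grants(direct) - direct
print(implied)
```

Since circular granting is not allowed, the grants form an acyclic graph and the loop terminates once no new pair can be added.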

4.3  Procedures,  Functions  and  Packages 

Oracle provides a programmatic approach to
manipulate database information using procedural
schema objects called PL/SQL (Procedural
Language/SQL) program units. Procedures, functions
and packages are different types of PL/SQL objects.
PL/SQL extends the capabilities of SQL by providing
some programming language features such as conditional
statements, loops etc. Procedures are also referred
to as stored procedures.

A procedure is a collection of instructions which can
be grouped together and are performed on database
objects to add, modify or delete database information.
In order to create a procedure a user should have the
CREATE_PROCEDURE system privilege. A procedure
can be executed by a user who owns it or by a
user who has execute privileges on it.


Figure 4: Entity-Relation Diagram for can-assign


A stored procedure runs with the privileges of the
user who owns it and not the user who is executing it.
This feature gives great flexibility in enforcing security.
For example, suppose we want a user to perform
some operations on a database but we do not want to
grant privileges explicitly. Then one can write a procedure
embedded with the necessary operations, and grant
execute privileges on the procedure to the user.³

Functions are very similar to procedures. The only
difference between a function and a procedure is that
a procedure call is a PL/SQL statement itself, while
functions are called as part of an expression. A function
always returns a value when it is called.

Packages are PL/SQL constructs that store related
objects together. A package is essentially a named
declarative section. It can contain procedures, functions,
variables etc. A package consists of two parts,
the specification part and body, stored separately in
the data dictionary. The package specification, also
known as package header, contains the information
about the contents of the package. The package body
contains code for the subprograms declared in the
header.


³ The privileges that are referenced in a procedure should
have been explicitly granted to the user who owns the procedure.
Privileges obtained by the owner via a role cannot be referenced
in a procedure.


5  IMPLEMENTING  URA97  IN  ORACLE

To implement URA97 we define Oracle relations
which encode the can-assign and can-revoke relations
of URA97. The can-assign relation of URA97 is implemented
in Oracle as per the entity-relation diagram of
figure 4. We assume that the prerequisite condition is
converted into disjunctive normal form using standard
techniques. Disjunctive normal form has the following
structure.

(... ∧ ... ∧ ... ∧ ...) ∨ (... ∧ ... ∧ ... ∧ ...) ∨ ... ∨
(... ∧ ... ∧ ... ∧ ...)

Each ... is a positive literal x or a negated literal ¬x.
Each group (... ∧ ... ∧ ... ∧ ...) is called a disjunct.
For a given prerequisite condition can-assign2 has a
tuple for each disjunct. All positive literals of a single
disjunct are in can-assign3, while negated literals are
in can-assign4.

The four PSO1 tuples of table 2 are represented
by this scheme as shown in table 5. The prerequisite
conditions in this case all have a single disjunct. An
example with multiple disjuncts is shown in table 6.

Table 5: Oracle can-assign Relations for PSO1 from Table 2
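
The split of a prerequisite condition across three relations can be illustrated with a small Python sketch. The column layout used here is hypothetical: each disjunct receives an integer identifier, the list standing in for can-assign2 holds one identifier per disjunct, and the lists standing in for can-assign3 and can-assign4 hold (disjunct identifier, role) pairs for positive and negated literals respectively. The condition in the example is also invented for illustration and is not taken from the tables.

```python
def encode(dnf):
    """Split a condition in disjunctive normal form into three row lists.

    dnf is a list of disjuncts; each disjunct is a list of
    (role, positive) literals, where positive is False for a negated
    literal. The row layout is an illustrative assumption."""
    can_assign2, can_assign3, can_assign4 = [], [], []
    for disjunct_id, disjunct in enumerate(dnf, start=1):
        can_assign2.append(disjunct_id)              # one row per disjunct
        for role, positive in disjunct:
            target = can_assign3 if positive else can_assign4
            target.append((disjunct_id, role))
    return can_assign2, can_assign3, can_assign4


def satisfied(member_of, tables):
    """Return True if the role set member_of satisfies some disjunct."""
    can_assign2, can_assign3, can_assign4 = tables
    for disjunct_id in can_assign2:
        positive = {r for i, r in can_assign3 if i == disjunct_id}
        negated = {r for i, r in can_assign4 if i == disjunct_id}
        if positive <= member_of and not (negated & member_of):
            return True
    return False


# Hypothetical condition with two disjuncts: (ED and not QE1) or (PE1 and QE1).
tables = encode([[("ED", True), ("QE1", False)],
                 [("PE1", True), ("QE1", True)]])
print(tables)
print(satisfied({"ED"}, tables))
```

Keeping positive and negated literals in separate relations means that each disjunct can be checked with two set comparisons, one for required roles and one for forbidden roles.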

The  can-revoke  relation  of  URA97  is  represented 
by  a  single  Oracle  relation.  For  example  table  3  is 
represented  as  shown  in  table  7. 

The can-assign, can-assign2, can-assign3, can-assign4,
and can-revoke relations are owned by the DBA who
also decides what their content should be. In addition
we have three accompanying procedures and a package
to support these. There is one procedure each
for assigning a user to a role, doing a weak revoke of
membership and doing a strong revoke of membership,
respectively as follows.

•  ASSIGN 

•  WEAK_REVOKE 

•  STRONG_REVOKE

Execute  privilege  on  these  procedures  is  given  to  all 
administrative  roles.  We  achieve  this  by  introducing 
a  junior-most  administrative  role,  say  GSO  (generic 
security  officer),  and  assigning  it  the  permission  to 
execute  these  procedures. 


These  relations  and  accompanying  procedures  and 
packages  are  owned  by  the  DBA.  Our  implementation 
also  maintains  an  audit  relation  which  keeps  a  log  of 
all  attempted  assignment  and  revoke  operations  and 
their  outcome.  The  audit  relation  is  also  owned  by  the 
DBA. 

Oracle does not provide convenient primitives for
testing whether or not a user is an implicit member
of a particular role. Testing explicit membership is
straightforward since explicit membership is encoded
as a tuple in Oracle’s system relations. To test implicit
membership, however, we need to chase the role hierarchy.
Oracle also does not provide direct support for
enumerating roles in a range set. We built a PL/SQL
package to support these requirements and assist in
writing our stored procedures, as discussed below.

One of the problems we encountered was the inability
of a stored procedure to determine which roles
have been turned on in a given session. Let us say
Alice is a member of the SSO role in our running example.
This gives her implicit membership in all administrative
roles. In RBAC96 Alice should be able
to decide which, if any, of these administrative roles
to turn on in a given session. Oracle allows turning
roles on and off in this manner. Unfortunately,
when Alice invokes a stored procedure there is no
means to determine from within the stored procedure
which roles Alice has turned on in that particular
session. This is a major obstacle in implementing
URA97 in Oracle. In fact this problem arises for all
kinds of extensions that could be proposed for Oracle
RBAC via stored procedures. The problem arises
because when a stored procedure is created the code
and execution path of queries in the procedure are
compiled and stored within the database. So when
a stored procedure is called it is not possible to determine
which roles are turned on in that session, because
the Oracle SESSION_ROLES view is based on
the current session and its execution path cannot
be predefined. The standard Oracle technique for
finding the roles of a session returns the empty set
if invoked within a stored procedure. We are told
that Oracle is aware of this problem and may have
a fix in future releases. However, in the interim, we
can overcome this problem by using a suitable Oracle
GUI front end tool like Oracle Forms or by using
Oracle Call Interface (an API tool). In both cases
we can use the IS_ROLE_ENABLED function to determine
whether a role is enabled and the SET_ROLE procedure
for enabling a role. These functionalities are
part of an Oracle package called DBMS_SESSION.
Unlike the security behavior of stored procedures, all
the procedures in the DBMS_SESSION package are
run with the privileges of the invoking user (and not
the privileges of the procedure or package owner). We
can call these procedures first from a front end tool or
Oracle Call Interface program, enable the proper roles via
IS_ROLE_ENABLED and SET_ROLE, and then call the
URA97 procedures for assigning or revoking roles to
a user. Of course, all of this will happen transparently
to the end user.

In our implementation of URA97 a user invokes the
stored procedure to grant a role to, or revoke a role
from, another user. The procedure calls are then as follows.

•  ASSIGN(user,  trole,  arole) 

•  WEAK_REVOKE(user,  trole,  arole)

•  STRONG_REVOKE(user,  trole,  arole) 

The parameters user and trole (target role) specify
which user is to be added to trole, or to be weakly
or strongly revoked from trole. The arole parameter
specifies which administrative role should be applied
(with respect to the user who is invoking the URA97
procedure). We have included the arole parameter
as a partial fix to the obstacle discussed above. The
procedure code will of course check whether or not the
user who calls the procedure is actually a member of
arole.⁴

All three procedures follow three basic steps.

1. If the user executing the procedure is an explicit
or implicit member of arole then proceed to step
2, else stop execution and return an error message
indicating this is not an authorized operation.

2. The tuple(s) from can-assign (for the assign procedure)
or can-revoke (for the revocation procedures)
are obtained where the AR role value equals or is
junior to the arole parameter specified in the
procedure call.

3. If trole is in the specified range for any one of the
tuples selected in step 2, then assign or revoke the
trole, else return an appropriate error message.

In case of ASSIGN also check whether the user
being assigned to trole satisfies the prerequisite
condition specified in the authorizing can-assign
tuple or not.

In case of STRONG_REVOKE the operation may
still fail due to all-or-nothing semantics.
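
The three steps can be sketched for the ASSIGN case in Python, using in-memory sets in place of Oracle relations. This is not the PL/SQL of the implementation. The administrative hierarchy, the sample rows, the status strings, and the names ADMIN_JUNIORS, ADMIN_UA, CAN_ASSIGN and assign are all assumptions made for illustration, and for brevity only explicit memberships of the target user are considered when testing the prerequisite condition.

```python
# Administrative role -> itself and its juniors (assumed hierarchy).
ADMIN_JUNIORS = {
    "SSO": {"SSO", "DSO", "PSO1", "PSO2"},
    "DSO": {"DSO", "PSO1", "PSO2"},
    "PSO1": {"PSO1"},
    "PSO2": {"PSO2"},
}

# Explicit memberships in administrative roles (sample data).
ADMIN_UA = {("alice", "DSO")}

# (AR, required roles, forbidden roles, expanded role range); sample rows.
CAN_ASSIGN = [
    ("PSO1", {"ED"}, set(), {"E1"}),
    ("PSO1", {"ED"}, {"QE1"}, {"PE1"}),
]


def assign(caller, user, trole, arole, ua):
    """Try to add (user, trole) to ua, the set of regular user-role pairs.

    Returns a status string describing the outcome."""
    # Step 1: caller must be an explicit or implicit member of arole.
    if not any(u == caller and arole in ADMIN_JUNIORS[a]
               for u, a in ADMIN_UA):
        return "not authorized"
    # Step 2: select tuples whose AR equals or is junior to arole.
    selected = [t for t in CAN_ASSIGN if t[0] in ADMIN_JUNIORS[arole]]
    # Step 3: trole must be in range and the prerequisite must hold.
    member_of = {r for u, r in ua if u == user}
    for _, required, forbidden, role_range in selected:
        if (trole in role_range and required <= member_of
                and not (forbidden & member_of)):
            ua.add((user, trole))
            return "assigned"
    return "no authorizing tuple"


ua = {("bob", "ED")}
print(assign("alice", "bob", "PE1", "PSO1", ua))
```

Passing arole explicitly mirrors the partial fix described above: the caller names the administrative role to apply, and step 1 verifies that the caller actually holds it.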

⁴ It is relatively straightforward to specify a set of administrative
roles instead of a single arole, and we plan to extend our
implementation to do that.


The implementation of steps 1 and 3 involves complex
queries built on Oracle internal tables. These queries
are performed dynamically at runtime. In order to
check whether the user is a member of arole (in step
1) and whether the role is in the specified range for
one of the relevant can-assign or can-revoke tuples (in
step 3), we use the Oracle CONNECT BY clause in our
queries. By using the CONNECT BY clause, one can
traverse a tree structure corresponding to the role
hierarchy in one direction. One can start from any point
within the role hierarchy and traverse it towards junior
or senior roles. But there is no control on the end
point of the traversal. Specific branches or an individual
node of the tree can be excluded by hard coding
their values. Such hard coding is not appropriate for a
general purpose stored procedure. In our implementation
we overcome this problem by performing multiple
queries and intersecting them to get the exact range.
We specifically do not hard code any parameters in
our queries.
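
The idea of intersecting two one-directional traversals can be shown outside of SQL with a short Python sketch. One traversal walks from the lower endpoint towards senior roles, the other walks from the upper endpoint towards junior roles, and their intersection is the closed range; open endpoints are then dropped. The edge list below is assumed from the role names used in the examples, and the names SENIORS, reachable and role_range are hypothetical.

```python
# Immediate seniors of each role; hierarchy assumed for illustration.
SENIORS = {
    "E": ["ED"], "ED": ["E1", "E2"],
    "E1": ["PE1", "QE1"], "PE1": ["PL1"], "QE1": ["PL1"], "PL1": ["DIR"],
    "E2": ["PE2", "QE2"], "PE2": ["PL2"], "QE2": ["PL2"], "PL2": ["DIR"],
    "DIR": [],
}


def reachable(start, edges):
    """Return start plus every role reachable from it along edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in edges.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def role_range(low, high, include_low=True, include_high=True):
    """Return the roles between low and high in the hierarchy."""
    # Invert the edge list so that we can also walk towards junior roles.
    juniors = {}
    for junior, seniors in SENIORS.items():
        for senior in seniors:
            juniors.setdefault(senior, []).append(junior)
    # Upward walk from low, downward walk from high, then intersect.
    result = reachable(low, SENIORS) & reachable(high, juniors)
    if not include_low:
        result.discard(low)
    if not include_high:
        result.discard(high)
    return result


# [E1, PL1): closed at E1, open at PL1.
print(sorted(role_range("E1", "PL1", include_high=False)))
```

Neither traversal needs to know where to stop, which is the limitation described above; the stopping point emerges from the intersection.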

In order to modularize our implementation we developed
a package which performs the necessary checks
involved in steps 1 and 3. All the procedures call this
package to do the verification. The package contains
several functions. Each one is designed to perform
certain tasks; for example, we have a function called
user_has_admin_role. This function takes the parameters
from the procedure which has called it and returns
the results to the calling procedure. There are
other functions which determine the range for a given
arole.

Our implementation is convenient for the DBA
since the stored procedures and packages we provide
are generic and can be reused by other databases. The
DBA only needs to define the roles and administrative
roles, and configure the can-assign and can-revoke
relations. Our implementation is available in the public
domain for other researchers and practitioners to
experiment with.


6  CONCLUSION 

In this paper we have developed the URA97 model
for assigning users to roles and revoking users from
roles. URA97 is defined in the context of the RBAC96
model [SCFY96]. However, it should apply to almost
any RBAC model, including [FCK95, Gui95, GI96,
HDT95, NO95], because user-role assignment is a basic
administrative feature which will be required in
any RBAC model.

Authorization to assign and revoke users to and
from roles is controlled by administrative roles. The
model requires that users already satisfy a designated
prerequisite condition (stated in terms of membership
and non-membership in roles) before they can
be enrolled via URA97 into additional roles. URA97
applies only to regular roles. Control of membership
in administrative roles remains entirely in the hands of
the chief security officer. We have identified strong
and weak revocation operations in URA97 and have
defined their precise meaning.

The paper has also described an implementation of
URA97 using Oracle stored procedures. Oracle’s built-in
primitives are cumbersome to use for determining
indirect membership in roles. We have implemented
suitable functions and packages to enable this conveniently.
These should be of use to other researchers
and practitioners and are available in the public domain.

A significant hurdle we encountered is that Oracle
does not allow a stored procedure to determine the
roles that are turned on in a given session. This is
a general problem of Oracle that will arise whenever
we try to extend Oracle RBAC via stored procedures.
In our implementation we require the user to specify
these roles explicitly when the stored procedure is
called. As discussed, this could be made largely
transparent with a suitable front end. Since most users will
interact with Oracle via such a front end, this may not
be a significant problem in practice.

In future work we will extend URA97 to develop
more comprehensive role-based administrative models
encompassing administration of role-permission
assignment and role-role relationships. We will also
investigate how URA97 can be adapted for user-group
assignment on platforms such as Unix and Windows
NT (including simulation of group hierarchies, which
neither product provides). More generally, we feel our
work will inspire other researchers and developers to
investigate administrative models using a systematic,
scientific and experimental approach. We feel the
security community has much to gain by pursuing such
work.

Advanced Internet Search Tools: Trick or Threat?

Current Internet search tools, e.g., Yahoo! and
AltaVista, are relatively simple. Their reliance on
indexed files containing keyword-to-IP-address
mappings limits them to handling low-level keyword
queries. Future Internet search tools will be much
more sophisticated. They will employ metadata
repositories to support content-based querying and
distributed, persistent agents performing a variety of
functions, including data gathering, metadata
extraction, data mining and information fusion. Users could
create swarms of persistent search agents that would
range the Internet in response to sophisticated queries,
keeping them informed about updates and terminating
only on explicit user directives. Clearly, such
search engines will pose serious threats to security and
privacy.

This paper describes the architecture of an advanced
search engine being developed at the University
of Tulsa to evaluate security and privacy threats.
The server houses a metadata repository, a base agent
and various search agents.

The metadata repository maintains schema
information about information repositories, including
structured, semi-structured and unstructured sources.
It is continually refreshed by metadata daemons,
persistent agents that search for new information sources,
old sources that are no longer accessible and those
whose schemas have been modified.

The base agent manages all queries submitted to
the search engine. It analyzes queries and determines
the appropriate number and type of search agents
needed. The base agent can start new search agent
threads or spawn persistent search agents. A new
thread is created for each non-persistent query. A
persistent query, e.g., one involving a changing or evolving
information source, requires the creation of a
persistent agent. All search agents report to their base agent,
which forwards information to the user.
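
The dispatch rule just described (one thread per non-persistent query, one persistent agent per persistent query, all reporting back to the base agent) can be sketched as follows. The class `BaseAgent`, its method names and the placeholder search bodies are assumptions made for illustration; the abstract does not describe the actual implementation at this level of detail.

```python
import queue
import threading


class BaseAgent:
    """Hypothetical base agent: routes each query to a thread or a persistent agent."""

    def __init__(self):
        self.results = queue.Queue()   # search agents report here
        self.persistent_agents = {}    # query -> stop event of its persistent agent

    def submit(self, query, persistent=False):
        """Start a one-shot search thread, or a persistent agent for evolving sources."""
        if persistent:
            stop = threading.Event()
            self.persistent_agents[query] = stop
            worker = threading.Thread(
                target=self._watch, args=(query, stop), daemon=True
            )
        else:
            worker = threading.Thread(target=self._search_once, args=(query,))
        worker.start()
        return worker

    def _search_once(self, query):
        # Placeholder for a one-shot search over the information sources.
        self.results.put((query, "done"))

    def _watch(self, query, stop):
        # Placeholder for a persistent agent: report, then run until terminated.
        self.results.put((query, "watching"))
        stop.wait()
        self.results.put((query, "terminated"))

    def terminate(self, query):
        """Explicit user directive that ends a persistent agent."""
        self.persistent_agents.pop(query).set()
```

The persistent agent stops only when `terminate` is called, matching the requirement that such agents end only on an explicit user directive.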

Search agents perform various functions, including
data gathering, querying, data mining and information
fusion. These agents access the metadata repository
for intensional (schema) information about
information sources. Separate translation agents, one for
each type of information source, are used to obtain
actual data. Search agents transmit results to their
base agent. In addition, they provide progress reports
conveying the number of sources searched, the number
remaining to be searched and the estimated completion
time. Data access failures are reported and
inaccessible sources are flagged for possible elimination
from the metadata repository. A user maintains control
of search agents through their base agent, enabling
searches to be tuned, suspended or terminated.

Translation agents enable search agents to access
heterogeneous information sources. Requests from
search agents, expressed in the Knowledge Query and
Manipulation Language (KQML), are translated into the
native languages of information sources. Similar
information sources, e.g., databases using the Java/Open
Database Connectivity (JDBC/ODBC) Interface,
require a single translation agent. Translation agents
can reside on the server or on machines hosting
information sources. Information source administrators
wishing to facilitate and/or control access by search
agents may choose to provide their own local
translation agents.
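
The idea of one translation agent per type of information source can be sketched as a small registry. Everything here is an assumption made for illustration: the class names, the use of a plain dictionary in place of a KQML message, and the toy SQL rendering are hypothetical and are not taken from the system described.

```python
class TranslationAgent:
    """Hypothetical translation agent for one type of information source."""

    def __init__(self, source_type, translate):
        self.source_type = source_type
        self._translate = translate   # callable: generic request -> native query

    def to_native(self, request):
        return self._translate(request)


class TranslatorRegistry:
    """Maps each source type to the single agent that serves it."""

    def __init__(self):
        self._agents = {}

    def register(self, agent):
        self._agents[agent.source_type] = agent

    def translate(self, source_type, request):
        if source_type not in self._agents:
            raise LookupError(f"no translation agent for {source_type!r}")
        return self._agents[source_type].to_native(request)


def sql_translator(request):
    # The request is a plain dict standing in for a KQML message (an assumption).
    return f"SELECT {', '.join(request['fields'])} FROM {request['source']}"


registry = TranslatorRegistry()
registry.register(TranslationAgent("sql", sql_translator))
native = registry.translate("sql", {"fields": ["title", "year"], "source": "papers"})
```

An administrator who wants to control access could register a local agent for the same source type that filters or rejects requests before translating them.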

Most search engine users will use a web browser
to access a GUI and query processor implemented by
Java applets, similar to current search engines.
However, a copy of the base agent may be downloaded
and run locally on a remote machine. The ability
to remotely execute base agents enhances local and
global performance. Note that the implementation of
the base agent as a Java application allows full search
engine functionality unhindered by the Java applet
security model.

An Environment for Developing Securely

This paper describes the implementation of
the Meta-Object Operating System Environment
(MOOSE) for supporting the development, execution
and verification of secure heterogeneous distributed
systems. Security features in MOOSE are integrated
at the meta-level within a new coordination language
for heterogeneous software components that interact
in a distributed virtual machine.

MOOSE's hierarchical operational and verification
frameworks blend formal methods and object technology.
The foundation of MOOSE is provided by the
Robust Object Calculus (ROC), a process calculus
for modeling and reasoning about distributed objects.
The Meta-Object Model (MOM) defined with ROC
is a primitive distributed object architecture for
constructing sophisticated object models and programming
languages. MOM implements a capabilities-based
security model of access control for distributed
objects. Capabilities, which are unforgeable tokens,
are modeled in ROC by unique names that are not
visible and cannot be reproduced.
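
A capability in the sense used above can be illustrated with an opaque token whose identity serves as the unique name. This is only a loose analogy and not ROC or MOM: the classes `Capability` and `GuardedObject` are hypothetical, and Python itself cannot enforce unforgeability against code that inspects object internals.

```python
class Capability:
    """Opaque token; its object identity plays the role of a unique name."""

    __slots__ = ()

    def __repr__(self):
        return "<capability>"   # reveal nothing that would help rebuild the token


class GuardedObject:
    """Hypothetical object that can be used only by presenting a granted capability."""

    def __init__(self, value):
        self._value = value
        self._grants = {}   # capability -> set of permitted operations

    def grant(self, *operations):
        """Mint a fresh capability that authorizes the given operations."""
        cap = Capability()
        self._grants[cap] = frozenset(operations)
        return cap

    def invoke(self, cap, operation, *args):
        """Perform the operation if, and only if, the capability permits it."""
        if operation not in self._grants.get(cap, frozenset()):
            raise PermissionError(operation)
        if operation == "read":
            return self._value
        if operation == "write":
            (self._value,) = args
            return None
        raise ValueError(f"unknown operation: {operation!r}")
```

A freshly constructed `Capability()` is a different object from any granted one, so it opens nothing; access can only be passed on by handing over the original token.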

We have used MOM to design an object-based
coordination language, Mumbo, for orchestrating the
secure interoperability of heterogeneous resources in
open systems. Mumbo employs wrapper technology
and abstract specifications to integrate native
components, while translators provide mappings from
high-level languages to ROC, permitting source-level
integration. Mumbo uses the MOM security model to
support Discretionary Access Control (DAC) for software
components. It also provides new language constructs
for constraining class and object protocols, giving
developers more control over component communication
patterns.


The ROC Virtual Machine (ROCVM) has been
developed in Java to execute (reduce) ROC expressions,
simulating and executing applications in
heterogeneous distributed environments. ROCVM's primary
security responsibility is to protect unique names
that model capabilities in MOM's security architecture.
Users cannot reproduce unique names or even
view them without permission. ROCVM provides an
interactive graphical interface for intuitive visualization
and analysis of systems. The Java implementation
permits operation on heterogeneous platforms.
ROCVM is designed for distributed operation, providing
multiple viewports for users into a given system.
Users can interact with ROCVM by adding ROC
expressions to executing systems on the fly. Verification
tools are also integrated into ROCVM, making high
assurance for heterogeneous distributed systems more
practical.

In this paper, we will set up MLS decision logic. Based on this logic, we can then formalize and
discuss inference rules in information tables, which include views and relation instances in relational
databases. Roughly, a formula in the logic is meant to describe objects in information tables. Objects may
have the same description; thus formulas may describe subsets of objects. For the single-level version we refer
readers to (Pawlak, 1991).

The Syntax of an MLS DL-language

Alphabet

a) $T$ - The set of attribute names

b) $V = \bigcup \mathrm{dom}(A)$ - The set of attribute values of $A \in T$, called the active domain of $A$ (Meyer, 1983)

c) $\{\neg, \wedge, \vee, \rightarrow, \equiv\}$ - The set of connectives (negation, and, or, implication, equivalence)

d) $C$ - A symbol to create a "place holder" to hold security classes.

Formulas

The smallest set satisfying the following:

a) Expressions of the form of an attribute-value pair $\langle A, v \rangle$, called atomic formulas, are formulas of the
DL-language for any $A \in T$ and $v \in \mathrm{dom}(A)$.

b) If $\varphi$ and $\eta$ are formulas, so are $\neg\varphi$, $(\varphi \wedge \eta)$, $(\varphi \vee \eta)$, $(\varphi \rightarrow \eta)$.

c) To each formula $\varphi$ in the DL-language, we associate a variable to hold a security class, denoted by $C(\varphi)$.

The Semantics of an MLS DL-language

MLS  DL-Model 

An information table $S = (U, T)$ (also known as an information system or knowledge representation system)
consists of

(1) $U = \{u, v, \dots\}$ is a set of entities.

(2) $T$ is a set of attributes $\{A_1, A_2, \dots, A_n\}$.

(3) $\mathrm{dom}(A_i)$ is the set of values of attribute $A_i$.

$\mathrm{Dom} = \mathrm{dom}(A_1) \cup \mathrm{dom}(A_2) \cup \dots \cup \mathrm{dom}(A_n)$.

(In databases (Meyer, 1983), $\mathrm{Dom}$ is commonly referred to as the active domain.)

(4) $\rho : U \times T \rightarrow \mathrm{Dom}$, called the description function, is a map such that

$\rho(u, A_i)$ is in $\mathrm{dom}(A_i)$ for all $u$ in $U$ and $A_i$ in $T$.

** On leave from San Jose State University (tylin@CS.sjsu.edu)


An information table is called a relation if for each $u$ the associated map
$t = \rho(u, \cdot) : U \rightarrow \mathrm{dom}(A_1) \times \mathrm{dom}(A_2) \times \dots \times \mathrm{dom}(A_n);\ u \mapsto (\rho(u, A_1), \rho(u, A_2), \dots, \rho(u, A_n))$
is a faithful representation of $U$.

A decision table is an information table in which the attribute set $T = C \cup D$ is a union of two non-empty
sets, $C$ and $D$, of attributes. The elements in $C$ are called conditional attributes. The elements in $D$ are
called decision attributes.
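
The definitions of an information table, a relation and a decision table can be made concrete with a small sketch. The class and function names are hypothetical, the description function is stored as a nested dictionary, and "faithful representation" is read here as "no two entities map to the same tuple", which is an interpretation rather than something the text states. Security classes are left out.

```python
from dataclasses import dataclass


@dataclass
class InformationTable:
    """S = (U, T) together with its description function."""
    rho: dict          # entity -> {attribute: value}
    attributes: tuple  # T, in a fixed order

    @property
    def universe(self):
        """U, the set of entities."""
        return set(self.rho)

    def dom(self, attribute):
        """Active domain: the values that actually occur for the attribute."""
        return {row[attribute] for row in self.rho.values()}

    def tuple_of(self, u):
        """The tuple associated with entity u under the description function."""
        return tuple(self.rho[u][a] for a in self.attributes)

    def is_relation(self):
        """True if distinct entities always map to distinct tuples (assumed reading)."""
        tuples = [self.tuple_of(u) for u in self.rho]
        return len(tuples) == len(set(tuples))


def decision_split(table, decision):
    """Split T into conditional attributes C and decision attributes D."""
    d = tuple(a for a in table.attributes if a in decision)
    c = tuple(a for a in table.attributes if a not in decision)
    if not c or not d:
        raise ValueError("both C and D must be non-empty")
    return c, d


table = InformationTable(
    rho={
        "u1": {"Dept": "sales", "Rank": "jr", "Access": "deny"},
        "u2": {"Dept": "sales", "Rank": "sr", "Access": "allow"},
        "u3": {"Dept": "hr", "Rank": "sr", "Access": "allow"},
    },
    attributes=("Dept", "Rank", "Access"),
)
```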

Interpretations 

As usual, at each level we will write $u \models_S \varphi$, or $u \models \varphi$ when $S$ is understood, if an object $u \in U$ satisfies a
formula $\varphi$ in an information table $S = (U, T)$. So we will say $u \models \varphi$ iff

$u \models \langle A, v \rangle$ iff $\rho(u, A) = v$
$u \models \neg\varphi$ iff not $u \models \varphi$
$u \models (\varphi \wedge \eta)$ iff $u \models \varphi$ and $u \models \eta$
$u \models (\varphi \vee \eta)$ iff $u \models \varphi$ or $u \models \eta$

We have many usual formulas, such as

$u \models (\varphi \rightarrow \eta)$ iff $u \models \neg\varphi \vee \eta$

We associate with the formula $\varphi$ the following set:
$|\varphi|_S = \{ u : u \in U \text{ and } u \models_S \varphi \}$.

It will be called the meaning of $\varphi$. A formula is said to be true if $|\varphi|_S = U$; $\varphi$ is logically equivalent to $\eta$ iff
their meanings are the same, i.e., $|\varphi|_S = |\eta|_S$. All formulas and their meanings are properly classified.
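
The satisfaction clauses and the meaning of a formula can be checked on a small table with the following sketch. The tuple encoding of formulas and the function names are assumptions made for the example, and the sketch covers a single level only: the security class attached to each formula is omitted.

```python
def satisfies(rho, u, formula):
    """Evaluate whether entity u satisfies the formula; formulas are nested tuples."""
    op = formula[0]
    if op == "atom":        # ("atom", A, v) stands for the attribute-value pair <A, v>
        _, attribute, value = formula
        return rho[u][attribute] == value
    if op == "not":
        return not satisfies(rho, u, formula[1])
    if op == "and":
        return satisfies(rho, u, formula[1]) and satisfies(rho, u, formula[2])
    if op == "or":
        return satisfies(rho, u, formula[1]) or satisfies(rho, u, formula[2])
    if op == "implies":     # read as (not phi) or eta
        return (not satisfies(rho, u, formula[1])) or satisfies(rho, u, formula[2])
    raise ValueError(f"unknown connective: {op!r}")


def meaning(rho, formula):
    """The set of entities that satisfy the formula."""
    return {u for u in rho if satisfies(rho, u, formula)}


def is_true(rho, formula):
    """A formula is true when its meaning is the whole universe."""
    return meaning(rho, formula) == set(rho)


def equivalent(rho, phi, eta):
    """Two formulas are logically equivalent in S when their meanings coincide."""
    return meaning(rho, phi) == meaning(rho, eta)


rho = {
    "u1": {"Dept": "sales", "Rank": "jr"},
    "u2": {"Dept": "sales", "Rank": "sr"},
    "u3": {"Dept": "hr", "Rank": "sr"},
}
sales = ("atom", "Dept", "sales")
senior = ("atom", "Rank", "sr")
```

With this table, `meaning(rho, ("and", sales, senior))` contains only `u2`, and the implication from `sales` to `senior` has the same meaning as the disjunction of `("not", sales)` and `senior`.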

Monotonic assumption. We will assume that at each level (the level and its dominated levels) the universe $U$ is
"well defined"; often we may need to use $U^L$ to denote the universe at level $L$. We will assume $U^L \subseteq U^H$, where $L \le H$.


The Deductive System of an MLS DL-language

At each security level (more precisely, the level and its dominated levels), we have

The inference rules: Modus ponens is the only rule.

The axioms

(1) The set of propositional tautologies

(2) Specific axioms:

(a) $\langle A, v \rangle \wedge \langle A, u \rangle \equiv 0$ for any $A \in T$ and $v, u \in V$ with $v \ne u$

(b) $\bigvee \{ \langle A, v \rangle : \text{for every } v \in \mathrm{dom}(A) \text{ and for every } A \in T \} \equiv 1$

(c) $\neg \langle A, v \rangle \equiv \bigvee \{ \langle A, u \rangle : \text{for every } u \in \mathrm{dom}(A) \text{ and for every } A \in T,\ v \ne u \}$

We need a few auxiliary notations and results. Let 0 and 1 denote falsity and truth at every security level.
A formula of the form

$\langle A_1, v_1 \rangle \wedge \langle A_2, v_2 \rangle \wedge \dots \wedge \langle A_n, v_n \rangle$

is called a P-basic formula or P-formula, where $v_i \in \mathrm{dom}(A_i)$ and $P = \{A_1, A_2, \dots, A_n\}$. For $P = T$, P-basic
formulas will be called basic formulas. The set of all basic formulas satisfiable in $S$ is called the basic
knowledge in $S$. The specific Axiom a) follows from the assumption that each entity can have exactly one
value in each attribute. Axiom b) says that each attribute must take one of the values of its domain. This is saying that $\mathrm{dom}(A)$
is the active domain of attribute $A$. Axiom c) allows us to get rid of negation in such a way that
instead of saying that an object does not possess a given property we can say that it has one of the
remaining properties. It implies the closed world assumption. Let $\Sigma_S(P)$, or simply $\Sigma(P)$, denote the
disjunction of all P-basic formulas satisfied in $S$. At each level, the closed world assumption can be expressed
in the following (Pawlak, 1991).

Proposition. $\models_S \Sigma_S(P) \equiv 1$, for any $P \subseteq T$.
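
The proposition can be checked mechanically on a finite table. In the sketch below a P-basic formula is encoded as a tuple of (attribute, value) pairs, and the function names are hypothetical. It verifies that every entity satisfies the disjunction of the P-basic formulas satisfied in S; the accompanying check that each entity satisfies exactly one of them reflects Axiom a). Only a single level is modeled.

```python
def basic_formulas(rho, P):
    """All P-basic formulas satisfied in S, each as a tuple of (attribute, value) pairs."""
    return {tuple((a, row[a]) for a in P) for row in rho.values()}


def satisfies_basic(row, formula):
    """An entity satisfies a P-basic formula when it matches every conjunct."""
    return all(row[a] == v for a, v in formula)


def sigma_is_true(rho, P):
    """Check that the disjunction of all satisfied P-basic formulas holds for every entity."""
    formulas = basic_formulas(rho, P)
    return all(
        any(satisfies_basic(row, f) for f in formulas) for row in rho.values()
    )


def matches_per_entity(rho, P):
    """Number of satisfied P-basic formulas each entity matches (expected to be 1)."""
    formulas = basic_formulas(rho, P)
    return {
        u: sum(satisfies_basic(row, f) for f in formulas) for u, row in rho.items()
    }


rho = {
    "u1": {"Dept": "sales", "Rank": "jr"},
    "u2": {"Dept": "sales", "Rank": "sr"},
    "u3": {"Dept": "hr", "Rank": "sr"},
}
```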

A formula $\varphi$ is a theorem, denoted by $\vdash \varphi$, if it is derivable from the axioms.

At  each  level,  the  set  of  theorems  of  DL-logic  is  identical  with  the  set  of  theorems  of  classical  propositional 
calculus  with  specific  axioms  (a)-  (c).  Computational  rules  of  security  classes  of  formulas  with  respect  to 
logic  connectives  will  be  discussed  in  the  full  paper. 